All Seminars
Title: Cracking the diversity code: Understanding computing pathways of those least represented in order to foster their representation |
---|
Seminar: Computer Science |
Speaker: Monique Ross, Florida International University |
Contact: Vaidy Sunderam, VSS@emory.edu |
Date: 2021-11-12 at 1:00PM |
Venue: https://emory.zoom.us/j/98352727203 |
Abstract: A significant gap exists in the understanding of factors that influence the participation of Black and Hispanic women in computer science. The objective is to listen to those often unheard in the conversation around broadening participation in computer science, in order to critically examine efforts and initiatives that impact engagement. This talk will describe the journey towards this objective and preliminary results. The outcomes of this work have the potential to reshape the community’s perceptions of what and who are computer scientists as well as crack the code to diversifying this lucrative and impactful discipline. Biography: Monique Ross earned a doctoral degree in Engineering Education from Purdue University. She has a Bachelor’s degree in Computer Engineering from Elizabethtown College, a Master’s degree in Computer Science and Software Engineering from Auburn University, eleven years of experience in the industry as a software engineer, and five years as a full-time faculty member in the departments of computer science and engineering. Her interests focus on broadening participation in computer science through the exploration of: 1) race, gender, and identity; and 2) discipline-based education research (with a focus on computer science courses) in order to better inform pedagogical practices that garner interest and retain women and minorities in computer-related fields. She is the PI on three National Science Foundation grants, one foundation grant, and co-PI on two large scale grants. Dr. Monique Ross is committed to the expansion of rigorous computer science education research at FIU and nationally. https://www.cis.fiu.edu/faculty-staff/ross-monique/ |
Title: Deep Learning with Differential Privacy and Adversarial Robustness |
---|
Defense: Computer Science |
Speaker: Pengfei Tang, Emory University |
Contact: Dr. Li Xiong, lxiong@emory.edu |
Date: 2021-11-11 at 3:00PM |
Venue: https://us02web.zoom.us/j/7382282740?pwd=QVB4bmU2NnlZN2s1UW0veUtCNklmUT09 |
Abstract: Deep learning models have become increasingly powerful on different tasks, such as image classification and data synthesization. However, two major vulnerabilities exist: 1) privacy leakage of the training data through inference attacks, and 2) adversarial examples that are crafted to trick the classifier into misclassifying. Differential privacy (DP) is a popular technique to prevent privacy leakage, which offers a provable guarantee on privacy of training data through randomized mechanisms such as gradient perturbation. For attacks using adversarial examples, there are two categories of defense: empirical and theoretical approaches. Adversarial training is one of the most popular empirical approaches; it injects adversarial examples with correct labels into the training dataset and renders the model robust through optimization. Certified robustness is representative of the theoretical approaches; it offers a theoretical guarantee of defense against adversarial examples through randomized mechanisms such as input perturbation. However, existing works have limitations that reduce the effectiveness of these approaches. For DP, one challenge is the trade-off between better utility performance and a given level of privacy guarantee. For adversarial training, one challenge is that when the types of adversarial examples are limited, the model robustness is confined. For certified robustness, existing works fail to exploit the connection between input and gradient perturbation, which wastes a part of the randomization during training. To address these limitations: 1) We propose a novel framework, IGAMT, for data synthesization. Compared with traditional frameworks, IGAMT adds less gradient perturbation to guarantee DP, but still keeps the complex architecture of generative models to achieve high utility performance. 2) We propose a distance-constrained Adversarial Imitation Network (AIN) for generating adversarial examples. We prove that, compared with traditional adversarial training, adversarial training with examples from AIN can achieve comparable or better model robustness. 3) We propose a new framework, TransDenoiser, to achieve both DP and certified robustness, which utilizes all randomization during training and saves the privacy budget for DP. |
Title: Measurement and Analysis Methods of Performance Problems in Distributed Systems |
---|
Defense: Computer Science |
Speaker: Lei Zhang, Emory University |
Contact: Ymir Vigfusson, ymir@mathcs.emory.edu |
Date: 2021-11-08 at 12:00PM |
Venue: https://emory.zoom.us/j/94559953414 |
Abstract: Today's distributed systems invest significant computational and storage resources to accommodate their large scale of data, but more resources do not automatically improve performance. To deliver high performance, new types of large-scale solutions, such as the cloud computing and microservices paradigms, follow the design of deploying loosely coupled components that perform but, in the process, make it harder to maintain a global view of system performance. With the ensuing growth in the complexity of system architectures, diagnosing and understanding performance problems has become both critically important and highly challenging. The aim of my thesis is to fill in some missing but significant parts towards monitoring and analyzing performance problems in distributed systems, by asking the question: What is the performance bottleneck of distributed systems, and how should we improve it? First, my thesis proposes a novel retroactive tracing abstraction where full telemetry information about a distributed request can be retrieved "back in time" soon after a problem is detected without unduly burdening any node in the system, with an always-on distributed tracing system. Second, my thesis frames the challenges of data placement in modern memory hierarchies in a generalized paging model outside of traditional assumptions, and provides an offline data placement algorithm towards optimal placement decisions. Last, my thesis derives a rule-of-thumb expression for cache warmup times, specifically how long caches in storage systems and CDNs need to be warmed up before their performance is deemed to be stable. |
Title: Fairness-Aware Predictive Modeling of Human Event Data |
---|
Seminar: Computer Science |
Speaker: Dr. Mingxuan Sun, Louisiana State University |
Contact: Joyce Ho, joyce.c.ho@emory.edu |
Date: 2021-11-05 at 1:00PM |
Venue: https://emory.zoom.us/j/98352727203 |
Abstract: Large volumes of human event data, such as online TV viewing records, disaster rescue requests, and electronic records of hospital admissions, are becoming increasingly available in a wide variety of applications including social network analysis, smart cities, and healthcare analytics. Predictive modeling of those collective event sequences is beneficial for improving event response efficiency and promoting nationwide economic development. Although current machine learning algorithms can achieve significant event prediction accuracy, the historic data or the self-excitation property can introduce biased predictions. In this talk, we introduce a series of novel models and algorithms that analyze human events while balancing prediction accuracy and fairness. Specifically, we investigate point processes and deep learning methods to improve event prediction accuracy. Furthermore, we introduce a fairness metric that can efficiently evaluate the ranking fairness in event prediction and use the metric to penalize the event likelihood function and to strike a balance between accuracy and fair loss. Biography: I am an Associate Professor in the Division of Computer Science and Engineering in the School of Electrical Engineering and Computer Science at Louisiana State University. I received my Ph.D. degree in Computer Science from the Georgia Institute of Technology in 2012. I received my Master's degree in Computer Science from the University of Kentucky in 2006 and my Bachelor's degree in Computer Science and Engineering from Zhejiang University, China in 2004. I was a Senior Scientist with the playlist recommendation group, Pandora Media, Inc. from 2012 to 2015. http://csc.lsu.edu/~msun/ |
Title: Implicit User-Generated Content in the service of Public Health |
---|
Seminar: Computer Science |
Speaker: Dr. Evgeniy Gabrilovich, Google Health |
Contact: Eugene Agichtein, eugene.agichtein@emory.edu |
Date: 2021-10-29 at 1:00PM |
Venue: https://emory.zoom.us/j/98352727203 |
Abstract: Every day millions of people use online products and services to satisfy their information needs. In the process of doing so, they produce large volumes of user-generated content (UGC). In this talk, we will distinguish between "explicit" UGC, which is intended to be made public (such as product ratings or reviews), and "implicit" UGC, which can be responsibly anonymized and aggregated in a privacy-preserving way to improve public health. We will analyze implicit UGC as a positive consumption externality, and will discuss its beneficial uses across a range of public health applications. The bulk of this talk will focus on methods for aggregating and classifying the data to provide timely signals that help guide public health interventions and assess their efficacy. We will discuss applications such as estimating disease incidence, outbreak prediction, mitigating pandemic spread, and improving public health messaging. Biography: Dr. Evgeniy Gabrilovich is a research director at Google Health where he leads the Public & Environmental Health team. Prior to joining Google in 2012, he was a director of research and head of the natural language processing and information retrieval group at Yahoo! Research. Evgeniy is an IEEE Fellow and ACM Distinguished Scientist. He is a recipient of the 2014 IJCAI-JAIR Best Paper Prize and the 2010 Karen Sparck Jones Award for his contributions to natural language processing and information retrieval. Evgeniy has served as a technical program chair for WSDM 2021, WWW 2017, and WSDM 2015. He earned his PhD in computer science from the Technion - Israel Institute of Technology. He also graduated (with extra credit) from the Executive MD training program at Harvard Medical School. |
Title: Data Collection for Data-Centric AI |
---|
Seminar: Computer Science |
Speaker: Dr. Fatemeh Nargesian, University of Rochester |
Contact: Vaidy Sunderam, vss@emory.edu |
Date: 2021-10-22 at 1:00PM |
Venue: https://emory.zoom.us/j/98352727203 |
Abstract: The holy grail of data-centric AI is to collect high-quality labeled data sets for the purpose of training ML models. Data collection has become an active area of research in the data management community due to the importance of handling large amounts of training data. This talk will examine the data collection techniques that can be used to discover, augment, or generate datasets from existing data lakes. I will also cover data tailoring, which ensures that the data set collected for analysis has an appropriate representation of relevant (demographic) groups, that is, that it meets desired distribution requirements. I will conclude by introducing some of the interesting research challenges that remain in the data collection landscape. Biography: Fatemeh Nargesian is an assistant professor in the Department of Computer Science at the University of Rochester. She got her PhD at the University of Toronto and was a research intern at IBM Watson. Before the University of Toronto, she worked at the Clinical Health and Informatics Group at McGill University. Her primary research interests are in data intelligence, focused on data discovery, data integration, and data for ML. |
Title: Computer Science in the Field |
---|
Seminar: Computer Science |
Speaker: Dr. Andreas Züfle, George Mason University |
Contact: Vaidy Sunderam, VSS@emory.edu |
Date: 2021-10-15 at 1:00PM |
Venue: https://emory.zoom.us/j/98352727203 |
Abstract: We computer scientists find solutions to problems. The impact of our solutions is limited by the importance of the problems that we solve. Since finishing my Ph.D. in 2013, I've been searching for good problems: problems that are challenging, leverage large amounts of data, and, if solved, have broad impacts for society. This presentation shows some of the problems that I've found during my tenure at George Mason University, including 1) my work in collaboration with transportation authorities in NSF's "Algorithms in the Field (AitF)" program to improve traffic conditions, 2) my work in collaboration with social scientists in DARPA's "Ground Truth" program to improve understanding of human behavior, 3) my work in collaboration with geographers in IARPA's "Space-based Machine Automated Recognition Technique (SMART)" program to analyze satellite images, and 4) my work in collaboration with epidemiologists in NSF's "Ecology and Evolution of Infectious Diseases (EEID)" program to understand and predict diseases and prevent pandemics. For each of the problems, I will briefly showcase some solutions that my students, collaborators, and I have found, and describe visions and directions for future work and potential collaboration. Biography: Andreas Züfle is a German computer scientist and associate professor at the Department of Geography and Geoinformation Science at George Mason University (GMU). He received his Ph.D. in Computer Science, summa cum laude, under the supervision of Dr. Hans-Peter Kriegel at Ludwig Maximilian University of Munich, Germany (LMU) in 2013. Andreas' research focuses on data management. He is mainly known for his contribution to the field of geospatial data management and mining. In this area, he has made contributions in several subareas, notably: uncertain data management, spatial indexing, clustering, and geosimulation. He collaborates closely with experts in geosciences, transportation, epidemiology, and social science to leverage computer science for interdisciplinary applications having broader impact. Since starting at GMU in 2016, his research has received more than $5,000,000 in funding from NSF, DARPA, and IARPA. He is the author of more than 100 peer-reviewed articles and has an h-index of 20. |
Title: Enabling Urban Intelligence by Harnessing Human-Generated Spatial-Temporal Data |
---|
Seminar: Computer Science |
Speaker: Dr. Yanhua Li, Worcester Polytechnic Institute (WPI) |
Contact: Liang Zhao, Liang.Zhao@emory.edu |
Date: 2021-10-08 at 1:00PM |
Venue: https://emory.zoom.us/j/98352727203 |
Abstract: The rapid development of mobile sensing and information technology has led to an explosive growth in both the amount and the scale of human-generated spatial-temporal data (HSTD). Examples of HSTD include taxi GPS trajectories, passenger trip data from automated fare collection (AFC) devices on buses and trains, and working traces from the emerging gig-economy services, such as food delivery (DoorDash, Postmates) and everyday tasks (TaskRabbit). Such HSTD capture unique decision-making strategies of the human agents (e.g., the passenger-seeking strategies of taxi drivers and transit mode choice strategies of travelers). Harnessing HSTD to characterize the unique decision-making strategies of human agents has transformative potential in many applications, including promoting individual well-being of gig-workers and improving the service quality and revenue of transportation service providers. In this talk, I will first introduce our spatial-temporal imitation learning framework that inversely learns and “imitates” the decision-making strategies of human agents from their HSTD. Moreover, I will present how to use the learned human decision strategies to enable human-centric urban intelligence that enhances the well-being and fairness for urban dwellers and society in terms of income level, travel and living convenience. Biography: Yanhua Li is an Associate Professor in the Computer Science Department and Data Science Program at Worcester Polytechnic Institute (WPI). His research interests focus on artificial intelligence (AI) and data science, with applications in smart cities in many contexts, including spatial-temporal data analytics, urban planning and optimization. Recently, his research has emphasized advancing imitation learning and meta learning in AI for learning and influencing the decision-making strategies of urban human agents, such as passenger-seeking strategies of taxi drivers and transit mode/route choices of urban travelers. Dr. Li received two Ph.D. degrees: one in computer science from the University of Minnesota at Twin Cities in 2013, and one in electrical engineering from Beijing University of Posts and Telecommunications, Beijing, China, in 2009. His work has been honored with the Best Applied Data Science Paper Award at SDM 2019. His research has been funded by NSF CAREER and CRII Awards, and two projects with the NSF Smart and Connected Communities (S&CC) Program. Please find more details of his work at https://users.wpi.edu/~yli15/ |
Title: Machine Learning Methods for Biomedical Keyphrase Extraction |
---|
Defense: Computer Science |
Speaker: Zelalem Gero, Emory University |
Contact: Joyce Ho, Joyce.Ho@emory.edu |
Date: 2021-10-05 at 1:00PM |
Venue: https://emory.zoom.us/j/8187241545 |
Abstract: Due to the increased generation and digitization of text documents on the Internet and digital libraries, automated methods that can improve search, discovery and mining of the vast body of literature are more essential than ever. Efficient automated methods that extract keywords to retrieve the salient concepts of a document are shown to be of paramount importance in text analysis, document summarization, topic detection, and recommendation systems, among others. One of the largest scientific databases, PubMed, contains more than 33 million citations and abstracts of biomedical literature to facilitate searching across several National Library of Medicine literature resources. The search results mainly depend on the effective indexing of the PubMed citations with MeSH (Medical Subject Headings) and author keywords. While indexing is enormously important in facilitating searching and clustering documents, automated software systems are still far behind human-level performance. In this dissertation, we focused on the two tasks of indexing PubMed citations with keywords and MeSH terms. To that end, we proposed 1) an unsupervised extraction method based on phrase embedding and a modified PageRank algorithm, which converges faster and performs better than related baseline methods; 2) a sequence-tagging deep learning method based on attending to words that are central to the document’s semantics; 3) a semi-supervised deep learning approach to harness vastly available unannotated biomedical data that improves keyword extraction based on uncertainty estimation; and 4) a reinforcement learning-based encoder-decoder method for MeSH indexing. |
Title: Multimodal Analysis of Healthcare Data Using Wearable Sensors and EHR |
---|
Seminar: Computer Science |
Speaker: Dr. Tanvi Banerjee, Wright State |
Contact: Joyce Ho, joyce.c.ho@emory.edu |
Date: 2021-10-01 at 1:00PM |
Venue: https://emory.zoom.us/j/98352727203 |
Abstract: In this talk, we will discuss two ongoing NIH funded projects that employ machine learning techniques to address chronic healthcare conditions. In the first project, we use wearable sensor data to measure the sleep quality in caregivers of persons with dementia as a means to assess caregiver burnout. In the second, we leverage EHR (electronic health records) notes, as well as wearable sensor data to detect pain in patients with sickle cell disease. I will discuss some of the technical challenges and contributions from these studies that focus on feature extraction, robust feature selection, and data summarization. Biography: Dr. Banerjee is an Associate Professor in the Department of Computer Science and Engineering, with a secondary appointment at the Department of Geriatrics, Boonshoft School of Medicine at Wright State University. Her research interest lies in using technology to solve healthcare challenges specifically to manage chronic conditions. Using mobile technology, she employs machine learning and signal processing techniques to assess patient symptoms remotely as well as unobtrusively. Her current research within the geriatric population includes using fitness devices for stress assessment in caregivers of patients with dementia (featured in local media sources, Alzheimer’s Association, as well as in the Research Features special issue of Women in Science, 2018). Dr. Banerjee has been awarded the NIH K01 and an additional R01 supplement (as PI) to support her work with caregiver stress in dementia patients, and is also a co-PI on two R01 projects for asthma management in children (completed), and pain assessment in patients with sickle cell disease, respectively. She is currently an associate editor for the journal IEEE Transactions on Fuzzy Systems, and serves on the program committees of workshops and conferences. Most recently, she is a moderator for TechRxiv (arXiv for IEEE Technical manuscripts) and has served on the program committee of AAAI, IEEE Big Data and reviewed for IEEE EMBC (Engineering in Medicine and Biology) and AMIA. |