We hope you have a wonderful holiday season and are able to join People Network calls in January! (For handy calendar entries, see the CaRCC Events Calendar.)
Data-Facing Track (first Tuesdays)
Tuesday, January 5, 1p ET/ 12p CT/ 11a MT/ 10a PT
Python for Big Data
Presenter: Bala Desinghu, Rutgers University
Python is a popular programming language for developing software and data science applications. Its popularity stems from many factors, such as simplicity, readability, and portability. However, Python is slow compared to C or Fortran, and it does not manage memory well. These limitations in speed and memory management may not be significant when analyzing small data sets, but they become bottlenecks when analyzing big data sets. Techniques based on vectorization, parallelization, just-in-time compilation, and distributed task execution have been widely adopted by the Python community to address these challenges. This presentation will cover a few techniques suitable for large-scale data analysis and answer the following questions: What can you do when the data set size exceeds the available physical memory? How can you speed up the data analysis? How can you distribute the workloads when doing machine learning on big data sets?
Researcher-Facing Track (second Thursdays)
Thursday, January 14, 1p ET/ 12p CT/ 11a MT/ 10a PT
All about CaRCC (… beyond the R-F Track)
Presenters:
Tom Cheatham, University of Utah
Lauren Michael, University of Wisconsin
Dana Brunson, Internet2
Patrick Schmitz, Semper Cogito Consulting