We hope you have a wonderful holiday season and are able to join People Network calls in January! (For handy calendar entries, see the CaRCC Events Calendar.)
Data-Facing Track (first Tuesdays)
Tuesday, January 5, 1p ET/ 12p CT/ 11a MT/ 10a PT
Python for Big Data
Presenter: Bala Desinghu, Rutgers University
Python is a popular programming language for developing software and data science applications. Its popularity stems from many factors, such as simplicity, readability, and portability. However, Python is slow compared to C or Fortran, and it does not manage memory well. These limitations in speed and memory management may not be significant when analyzing small data sets, but they become bottlenecks when analyzing big data sets. Techniques based on vectorization, parallelization, just-in-time compilation, and distributed task execution have been widely adopted by the Python community to address these challenges. This presentation will address a few techniques suitable for large-scale data analysis and answer the following questions: What to do when the data set size exceeds the available physical memory? How to speed up the data analysis? How to distribute the workloads when doing machine learning for big data sets?
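As a small taste of the first question (data that does not fit in memory), here is a minimal sketch of chunked, streaming processing using only the Python standard library. It is an illustration under our own assumptions and not material from the presentation: the helper names read_chunks and streaming_mean, the column name "value", and the tiny in-memory sample are all hypothetical.

```python
import csv
import io
from itertools import islice


def read_chunks(handle, chunk_size):
    """Yield lists of at most chunk_size parsed rows from a CSV file handle."""
    reader = csv.DictReader(handle)
    while True:
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            return
        yield chunk


def streaming_mean(handle, column, chunk_size=1000):
    """Compute the mean of one column while holding only one chunk in memory."""
    total = 0.0
    count = 0
    for chunk in read_chunks(handle, chunk_size):
        total += sum(float(row[column]) for row in chunk)
        count += len(chunk)
    return total / count if count else float("nan")


# Hypothetical stand-in for a file that is too large to load at once.
sample = io.StringIO("value\n1\n2\n3\n4\n5\n")
result = streaming_mean(sample, "value", chunk_size=2)
print(result)
```

The same pattern applies to a real file opened with open(path, newline=""): only chunk_size rows are held in memory at a time, at the cost of restricting the analysis to computations that can be accumulated chunk by chunk.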
Researcher-Facing Track (second Thursdays)
Thursday, January 14, 1p ET/ 12p CT/ 11a MT/ 10a PT
All about CaRCC (… beyond the R-F Track)
Tom Cheatham, University of Utah
Lauren Michael, University of Wisconsin
Dana Brunson, Internet2
Patrick Schmitz, Semper Cogito Consulting
The Researcher-Facing Track would like to take the opportunity to start 2021 by pausing, reflecting, and asking “Who are we?” CaRCC has grown and evolved over the past two to three years: the People Network alone currently has over 900 people on its mailing list, and the number of working groups has approximately doubled. Many people within CaRCC and the broader Ecosystem are hard at work “boundary spanning”: bringing groups together, sharing ideas, and helping both the RCD profession and the research & researchers we support. We will give attendees an overview & update on the organization, highlight some key example activities and outputs of a couple of Working Groups, and conclude with Engagement. This is a great time to invite new colleagues to come and learn about CaRCC, its activities, and how to Get Started or Get Involved!
Note: We plan on having a similar call topic in each track over the coming months.
Emerging Centers Track (third Wednesdays)
Wednesday, January 27, 12p ET/ 11a CT/ 10a MT/ 9a PT
Note: We have moved our January meeting back to Wednesday, January 27, due to the Presidential Inauguration on Jan. 20. The session will be recorded and made available for those who cannot attend.
Open Science Grid (OSG) – A national, distributed computing partnership for data-intensive research – and potential partner for your CC* Campus Compute proposal
Presenter: Lauren Michael
The Emerging Centers Track January call will bring representatives from the Open Science Grid who will describe how their services fit into the national cyberinfrastructure ecosystem, facilitating distributed high throughput computing for researchers. Lauren Michael will share how the OSG team can partner with you by:
- bringing the power of the OSG to YOUR researchers
- gathering science drivers and planning local computing resources, or
- supporting CC*-required resource sharing for the Campus Compute category, and other options for integrating with OSG
Systems-Facing Track (third Thursdays)
Thursday, January 21, 12p ET/ 11a CT/ 10a MT/ 9a PT
HPC Cluster Operating Systems Options
With the recent announcement from IBM/Red Hat that CentOS 8 will be EOL at the end of 2021 and CentOS 7 in 2024, many HPC systems professionals now face the prospect of having to migrate off the downstream distribution and either adopt CentOS 8 Stream or purchase licenses for RHEL. Some institutions run other operating systems on their clusters: most notably, Cray has deployed SuSE as part of their Cray environment, and NVIDIA DGX boxes run more smoothly with Ubuntu. At our January meeting, we will have a panel discussion on this topic and explore participants' experiences using, migrating to, or migrating from other distributions on their HPC resources.
Interested participants need not subscribe to a particular track to participate in calls. However, additional details for track members, including notes documents and any pre-call activities, will be distributed ahead of the call via the email lists and other communication channels within each track.
The CaRCC People Network aims “to foster, build and grow an inclusive community (termed the “People Network”) for campus CI, research computing and data professionals.” If you have received this email NOT via CaRCC’s People Network, and you would like to join the People Network, which includes Researcher-facing, Data-facing, Systems-facing, and other tracks, please fill in the form at http://bit.ly/join_carcc_people_network.
All calls will take place within the same Zoom room distributed via email. Please join the People Network (link just above) or contact firstname.lastname@example.org for details.