Science for the Future

With data spilling from all areas of life, students need tools and training to understand it. The UNC School of Data Science and Society is here to help.

June 20th, 2023

Marcel Ravidat could never have imagined what lay ahead when he and his dog, Robot, came across a foxhole in the southwestern French countryside in 1940. The tunnel was rumored to be a secret passage to a nearby manor, and the curious 18-year-old returned with three friends to investigate further. After days of digging, they found an ancient masterpiece 50 feet underground: the cave art of Lascaux. Giant paintings and carvings of animals depict a variety of scenes throughout the chambers.

To data scientists, sites like Lascaux are full not only of artifacts but also of raw data. While a spreadsheet of numbers may come to mind when one thinks of data, its definition is much broader: factual information used as the basis for reasoning, discussion, or calculation.

The people who created the work almost 20,000 years ago lit the cave with fireplaces and sandstone lamps fueled by animal fat. They used their hands, brushes, and hollow bones to apply the paint. Scenes give insight into the fauna of the time, depicting some species that have long been extinct. The red, yellow, and black pigments were prepared by mixing or heating minerals like hematite, iron oxyhydroxides, charcoal, and manganese oxide. The closest known source of this type of manganese oxide was about 150 miles away, leading to the conclusion that these people conducted trade or used supply routes.

All of this is data — and it gives valuable insight into the lives of humans in Europe during the Upper Paleolithic era.

“Data has been here forever. Before we humans were here, there was data about the universe,” says Jay Aikat, vice dean of the UNC School of Data Science and Society (SDSS) and research professor of computer science. “It’s a matter of us paying attention to the data and what we’re doing with it to advance humankind.”

That advancement is at the forefront of SDSS. Launched in 2022, the school is centered on advancing the field of data science and understanding how it affects society.

As a hub of the technology and biotech industries, the Research Triangle is a fitting spot for a new school of data science. The Triangle is home to around 4,000 tech companies, and some of its fastest-growing segments are in areas like analytics, nanotechnology, and wearables. Giants like VinFast, Wolfspeed, and Apple, as well as a myriad of smaller companies and startups, are keen to hire graduates with strong data literacy.

For better or worse, data is pervasive in every area of life. Just as researchers piece together the lives of the Lascaux artists from artifacts, data scientists can analyze behavior through modern technology. Our physical activity, internet search queries, purchasing and television preferences, and driving tendencies are all tracked through means like smart devices, financial transactions, and security cameras.

“Data has a persuasive force. It has the ability to make arguments seem more plausible, more impactful, more strong,” says Stan Ahalt, dean of SDSS and professor of computer science. “People who use data have a persuasive podium. That podium can be used for very positive things, but it also could be used to distort things in a certain way.”

It seems like the saying “there’s data supporting this” has become the new “I saw it on TV” or “I read it on the internet.” Because of this persuasive force, Aikat says it’s imperative that students have a holistic understanding of data.

“How we collect the data, how we analyze the data — are we thinking about privacy and security as we’re collecting data? All of those things matter,” Aikat says. “So, we need to make sure that students are data literate. All our students are data literate.”

SDSS will launch its online Master of Applied Data Science program in January 2024. Graduates will gain general skills in programming, statistics, and mathematics, along with data management, ethics, and governance, as well as specialized skills in machine learning, visualization, and communication. This program will be followed by undergraduate and graduate degrees and a certificate program for working professionals.

The goal is to give students multiple avenues to incorporate data science courses into their degrees, according to Aikat. This cross-disciplinary focus also drives how SDSS approaches research.

“We’re not just a silo as a school,” she says. “The success of the school really depends on having very strong collaboration with many different units across campus for UNC as a whole to be a data science powerhouse.”

The concept for SDSS began moving forward in 2016, when Carolina geneticist and former vice chancellor for research Terry Magnuson charged the first committee to start thinking big about data science. As the founder of UNC-Chapel Hill’s genetics department, Magnuson has decades of experience building research units, engaging with industry partners, and pushing science policy at local and national levels.

SDSS is currently in discussions with faculty across campus and with industry partners to pinpoint research concentrations within the school. Broadly speaking, research will center on how data science can be used to solve social issues.

“At the core of UNC is ‘Service to the State,’ and the UNC School of Data Science and Society very much sees that […] as part of our mission,” Aikat says. “All of the research that we’re doing is focused on problem areas. How can we take what we’ve learned and apply that to social problems?”

Ahalt stresses that data training is important for students in any field — from the hard sciences to art history and everything in between.

“I’m increasingly convinced that any disciplinary area is going to be impacted by data increasingly as time goes by,” Ahalt says. “We haven’t been confronted with the volume of data that is now occurring in many industries. And so, as a consequence, I think any student in any degree is going to make themselves much more future-proofed by having the ability to use data in a very facile way.”

Stan Ahalt is the dean of the UNC School of Data Science and Society and professor in the Department of Computer Science within the UNC College of Arts and Sciences. He is also the executive advisor and domain scientist for team science at the Renaissance Computing Institute and associate director of informatics and data science at the North Carolina Translational and Clinical Sciences Institute.

Jay Aikat is the vice dean of the UNC School of Data Science and Society and research professor in the Department of Computer Science within the UNC College of Arts and Sciences.