What is data proliferation and how does it affect Carleton?

13 April 2023
By Markus Gunadi

Introduction:

[Illustration: three people running away from a large wave made of numbers, which represents the proliferation of data.]

As of July 2022, Google no longer offers free unlimited storage to educational institutions. Faced with new storage costs, institutions have been asking how to store data efficiently without significantly increasing their spending. But this raises questions for everyone living in today’s digital world. Should we be more careful about what data we keep? What are the consequences of keeping all of our data? What are the benefits? Is it useful, or is most data being stored for no reason?

[In this article, I define data as any information stored on a computer.]

To understand this problem more easily, we can look at a smaller example: the storage on my iPhone! As a young college student, I am always downloading movies, music, and photos onto my iPhone. In addition, I am a mobile game addict when it comes to games like Candy Crush and Clash of Clans. Over the summer I had a grave problem. I wanted to download a fun new game, but it required around two gigabytes of storage and I only had 1.5 gigabytes remaining. I had three options: I could buy more iCloud storage from Apple at a premium, I could delete stored stuff I don’t use or need (e.g. videos or all my texts that were over a year old), or I could move some stuff off my phone to another storage location. I ultimately decided not to get the game because I was not willing to pay extra, delete my photos and messages, or sort out where else I could put stuff. Still, I learned which types of files took up most of my storage space (video and apps) and which types of data were most important to me (memories of my friends and family).

Even after deciding which data mattered most to me, I realized that I hadn’t thought about most of it for years. It still made sense to save photos and videos, but did I really need all those old text messages, or the meditation and self-help apps? It felt like I was a hoarder, but for things that didn’t even physically exist. The pictures saved on my phone weren’t physical objects, but I began to wonder whether my “hoarding” of data did any harm.

What are the dangers of data proliferation?

It turns out that a lot of problems come with storing data today. Beyond the ethical questions, stored personal data can be leaked and misused. For example, in 2018 it was reported that millions of Facebook users had had their personal information collected and used for personalized political advertisements (New York Times). Today, consumers are often turned into products for companies. Our data and private information are sold to advertisers for targeted ads, and this has changed the very meaning of privacy in our society.

Furthermore, storing all of this data raises an environmental concern. According to Steven Gonzalez Monserrate, writing in the MIT Press Reader, “The Cloud now has a greater carbon footprint than the airline industry. A single data center can consume the equivalent electricity of 50,000 homes.” Data storage centers are a central part of today’s Internet-driven world, but one cannot ignore the resources they consume to run 24/7. Despite new environmentally minded goals from tech giants, many of these data centers will likely continue to rely on coal or other carbon-intensive energy sources far into the future. Our unthinking habit of constantly collecting and saving data is harming our planet.

And these problems with data aren’t ending anytime soon. According to Michael Cooney, an editor for Network World, “By 2022, more IP traffic will cross global networks than in all prior internet years combined up to the end of 2016. In other words, more traffic will be created in 2022 than in the first 32 years since the internet started.” Although 2022 has come and gone, data proliferation continues: the IDC (International Data Corporation) estimates that in 2025 the world will create and replicate 163 zettabytes (ZB) of data.

What can institutions and individuals do to help?

Ultimately, we are lucky that, as a privileged institution, we do not need to worry about deleting important data because of the associated costs, unlike me and my iPhone. However, this doesn’t mean we can stop thinking about these problems. We need to prioritize storing only what we need, and in doing so we can also make our own lives easier.

Often when I look at my peers’ and family members’ computers, I see no organization system for their files. And even when there is one, most files probably haven’t been touched in years and will never be used again. We need to curate our files by trashing the documents we don’t need and organizing the rest so that we can find them easily. One way to do this is by following the FAIR data principles.

FAIR stands for “Findable, Accessible, Interoperable, and Reusable.” Following the FAIR principles essentially means that your files can be found when you need them, are stored in a safe location, work without access to an old subscription, account, or piece of software, and can be reused later. In short, you should be able to easily locate and use all your intentionally saved files, and delete the extra stuff! If everyone does this, we can help reduce the carbon emissions tied to data storage, and we will actually be able to access and use what we intentionally saved.
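If you want a starting point for this kind of curation on your own computer, here is a minimal Python sketch. It is only an illustration, not an official tool or part of the FAIR principles: the function name `find_stale_files`, the one-year cutoff, and the `Documents` folder are all my own assumptions. It deletes nothing; it only lists files that have not been modified in a long time, largest first, so you can decide what to keep.

```python
import os
import time
from pathlib import Path


def find_stale_files(root, older_than_days=365, now=None):
    """Return (path, size_in_bytes, age_in_days) for every file under `root`
    whose last-modified time is more than `older_than_days` ago,
    sorted with the largest files first."""
    now = time.time() if now is None else now
    cutoff = now - older_than_days * 86400  # 86400 seconds in a day
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                info = path.stat()
            except OSError:
                continue  # skip files that disappear or cannot be read
            if info.st_mtime < cutoff:
                age_days = int((now - info.st_mtime) // 86400)
                stale.append((path, info.st_size, age_days))
    stale.sort(key=lambda item: item[1], reverse=True)
    return stale


if __name__ == "__main__":
    # Hypothetical starting folder; change it to whatever you want to review.
    folder = Path.home() / "Documents"
    results = find_stale_files(folder, older_than_days=365)
    total_mb = sum(size for _path, size, _age in results) / 1_000_000
    print(f"{len(results)} files untouched for over a year ({total_mb:.1f} MB)")
    for path, size, age in results[:20]:
        print(f"{size / 1_000_000:8.1f} MB  {age:5d} days  {path}")
```

The script relies on each file's "last modified" date, which is only a rough stand-in for "haven't touched it in years": a file you open and read often but never edit will still look old. Treat the output as a list of candidates to review, not a list of things to delete.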

Side Tip: Here is one helpful Google Drive tip from our Academic Technologists, Paula Lackie and Wiebke Kuhn, for organizing and curating your files. It is taken from this Skillshop presentation (accessible only with a Carleton email account).

  1. Type "is:unorganized owner:me" into the “Search in Drive” box to find files you own that are not in any folder.
  2. To delete a folder as well as its contents, you MUST first select and delete all the files in it (not the subfolders), THEN delete the folder.
  3. This is tedious, and there’s no way around it right now.
  4. Remember, you can only delete something that you own. (This can get convoluted in Shared Drives.)
Picture of Google Slide Instructions
Google Drive search fields

Paula Lackie likens files to the childhood art your parents hung on the fridge. If your parents had hung every project you ever did on the fridge and never trashed anything, would you even be able to use your fridge, or even the kitchen? If you wanted to find an important drawing of a flower you made for your grandma, would you be able to find it? Curation allows us to keep only the most important and impressive pieces of work on our fridge while still allowing us to use our kitchen.

Picture of FAIR logo

GO FAIR. Retrieved 2021-08-16.  Material was copied from this source, which is available under a Creative Commons Attribution 4.0 International License.

Final Thoughts

We need to keep learning how to collect data ethically, to help our student body and surrounding communities. We need to focus on the fight against climate change by continuing to become more sustainable and divesting from fossil fuels. We have to continue learning and teaching to solve the problems that threaten our world. Before learning about the FAIR data principles, I did not curate my files at all and could never find the important documents I needed for school projects, job applications, etc. Thinking more about keeping excess files to a minimum has helped me be more organized while also saving space on my computer. Although I’m still far from perfect, I hope to continue improving my file management skills, and hopefully so will you!
