A Brief History of Data
The history of how human civilisation has recorded and manipulated data is fascinating. Contrary to popular belief, the concept of data existed long before the beginnings of the Roman Empire or democracy in Ancient Greece. The word data itself comes from Latin, where it is the plural of datum, meaning "something given". It referred to things known or assumed as facts that could be used for reasoning or analysis. With this in mind, let's take a quick look at the story of data and how it has evolved throughout history.
Data for Preserving Knowledge Through the Ages
The Ishango Bone, a baboon bone, is one of the earliest known traces of data recording by our ancestors, dating from the Palaeolithic era, around 19,000 BC. It bears carved markings that may represent an early form of counting. Unlike random scratches, the markings appear to follow a structured system, making it one of the earliest examples of data representation and record keeping.
Over the centuries, human civilisation has continued to store information in order to gain knowledge and find new ways to develop its tools. The Library of Alexandria in Egypt, for example, was famous for its collection, which historians estimate at over half a million documents before it was destroyed by fire.
Later, in the 15th century, the invention of the printing press marked a major step forward in documenting information. It allowed people to produce books, and later newspapers, on a massive scale. In some ways, it helped to standardise records and establish common policies, which can be seen as the first steps towards data governance and the standardisation of records.
In the 1600s, John Graunt studied birth and death records in London. By finding patterns in this data, he gave the community a clearer view of the population's growth. Today, he is regarded as one of the first people to use statistical records to better understand the evolution of the world around him.
Data as the Backbone of Workforce Management and Trade During the Industrial Revolution
Data became a valuable asset for countries during the Industrial Revolution. The first census was conducted in the United States in 1790 and in the United Kingdom in 1801, giving governments a basis for understanding their populations and managing their workforce. With the development of railways, the growth of trade and the tracking of goods, data records became more diverse and served as a source for planning future projects.
In 1928, Fritz Pfleumer invented magnetic tape, which became a standard medium for recording audio and, later, video, and was used for data storage in early computer systems. It provided an inexpensive, high-capacity and reliable way to store large amounts of information. Because of its low cost, magnetic tape remained dominant for several decades until new innovations such as the hard disk drive came along.
The Second World War saw the birth of electronic computers, some of which were built to decode encrypted messages. It was also the beginning of the digital age. A few decades later, in 1970, another breathtaking innovation was unveiled by Edgar F. Codd, a British-born computer scientist at IBM. He proposed the relational model for databases, which reduced complexity by organising data into tables of related records. This made it much easier for IT professionals to query and analyse data.
Data Through the Era of the Internet and Big Data
The outstanding innovation of this period was Tim Berners-Lee's invention of the World Wide Web in 1989, which revolutionised the way people access, share and interact with information. His creation provided a global platform for communication, e-commerce, education and entertainment, fundamentally changing society. From this point forward, new opportunities arose to create data records on any subject.
As more people and businesses connected online, the volume of data grew exponentially. Several social media sites, such as Facebook, came online and began to generate data based on user interaction.
The rise of smartphones in the 2000s meant that people could share their information from anywhere, leading to the era of Big Data. The Big Data label was popularised in the early 2000s. While it's not known exactly who invented the term, it refers to volumes of data so large that they can't be analysed or processed using traditional tools, so specialised software is used instead. Such data is often complex: it may be unstructured (text, video...), highly variable or follow non-linear patterns.
As a result, several tools have been developed to handle these large amounts of data. Big Data has been a boost for data and digital innovation, creating new opportunities to produce high-quality analyses and make real-time decisions.
Companies are now able to use advanced algorithms to deliver hyper-personalised recommendations. Streaming services like Netflix, music platforms like Spotify and e-commerce platforms like Amazon use big data to deliver personalised content, advertising and product suggestions, creating a more engaging and tailored user experience.
The Balance Between Data Innovation and Data Privacy
As digital platforms and online services have become an integral part of everyday life, organisations have gained unprecedented access to users' personal data. In many cases, users are unaware of how their data is being used, sold, or shared. Indeed, consumers have been largely excluded from the value created by their own data.
The full exploitation of personal information from data logs therefore raises ethical concerns. For this reason, governments and related bodies have enacted laws that regulate the use of personal data and the length of time it may be retained, such as the European Union's General Data Protection Regulation (GDPR), adopted in 2016. Some argue that strict data protection rules could hinder data innovation and the improvement of the user experience. In particular, artificial intelligence (AI) relies heavily on vast amounts of data to drive insights and personalisation.
In addition, the quality of data is critical for AI to work effectively - accurate, diverse and high-quality datasets enable more reliable predictions, better decision-making and improved user experiences. Restrictions on data access could limit the availability of such high-quality datasets, ultimately stalling progress and reducing the potential benefits of AI.
This is particularly true today, as enterprises race to develop the most capable generative AI. To build powerful and accurate models that respond correctly to user prompts, these companies need access to diverse and high-quality data, including personal data. Since 2022, the competition has intensified, with companies striving to refine their models by leveraging extensive datasets.
In conclusion, finding a balance between data innovation and data protection is essential to ensure that technological progress can continue without compromising the fundamental rights of individuals. On the one hand, data-driven innovation, in particular through AI and machine learning, has immense potential to improve user experiences, optimise services and drive economic growth. On the other hand, robust data protection is needed to safeguard personal data and maintain trust in digital platforms.