BIG DATA - large-scale databases
Doru Bulubasa
03 November 2016

BIG DATA is a concept that has started to take shape recently. Day by day, databases grow exponentially as the number of internet users increases. Even a single visit to the most basic website in the world can be recorded.

If we use a search engine, the keywords we typed end up in a database of internet habits. In effect, every click generates a record. Imagine how much data is generated if 40,000 searches are made in just one second (this is roughly what Google claims).
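To put that figure in perspective, here is a minimal back-of-the-envelope sketch. It assumes the rate of roughly 40,000 searches per second stays constant and that each search produces exactly one record; both are simplifications, and the function name `records_generated` is purely illustrative.

```python
# Back-of-the-envelope estimate based on the article's approximate figure
# of 40,000 searches per second. Assumption: one record per search,
# and a constant rate throughout the day.
SEARCHES_PER_SECOND = 40_000

SECONDS_PER_MINUTE = 60
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def records_generated(seconds: int, rate: int = SEARCHES_PER_SECOND) -> int:
    """Return the number of records created in `seconds` at a constant rate."""
    return rate * seconds


per_minute = records_generated(SECONDS_PER_MINUTE)
per_day = records_generated(SECONDS_PER_DAY)

print(f"Records per minute: {per_minute:,}")
print(f"Records per day:    {per_day:,}")
```

Under these assumptions, a single day yields about 3.5 billion records (3,456,000,000), before counting any of the clicks that follow each search.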

BIG DATA can be described by the following characteristics:

1.    Volume - The amount of data generated and stored. The size of the data determines its potential value and the insight it can offer, and whether it can be considered BIG DATA at all.
2.    Variety - The type and nature of the data. As the volume of data has increased, so has the number of sources and types of data. The vast majority of this data is unstructured, which significantly complicates the analysis process, so knowing what kind of data is involved helps people analyze it efficiently and understand the results.
3.    Velocity - Data is generated at increasingly high speeds. Therefore, the speed of analysis and interpretation must also be high to keep up.
4.    Veracity - The quality of the captured data can vary greatly, which affects the accuracy of the analysis.

BIG DATA has launched a true industry of processes, personnel, and technology to exploit the immense potential of this new frontier. Large companies such as Amazon, Wal-Mart, Google, and Microsoft use BIG DATA to shape their future strategies. It also plays an important role for small and medium-sized companies, helping them organize themselves better or establish their business strategies.

What exactly is BIG DATA?

Strictly speaking, it is not a new concept: it dates back to 2001. It is the information held by any company, obtained and processed through new techniques so that it brings benefits as efficiently as possible.

Companies have strived for decades to use the information they hold to grow or diversify a business. BIG DATA is special for two reasons: the significant volume of information, which can "open" opportunities, and the way this information is analyzed, which can help with this "opening."

Interpretation can reveal perspectives that are not immediately visible or that could not be found using traditional methods. The entire interpretation process focuses on finding paths, trends, or patterns that are normally invisible. This is precisely what requires new technologies and skills.

Therefore, the days when the data held by a company was stored in nicely organized Office documents and shared among employees are almost gone. If 50 GB of file storage space was more than enough in 2002, today it most likely represents the volume of data generated in one minute.
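As a rough illustration of what that comparison implies, the sketch below takes the guess of 50 GB generated per minute at face value and scales it up. The rate is only the estimate from the paragraph above, not a measurement, and decimal units (1 TB = 1,000 GB) are assumed.

```python
# Illustrative arithmetic only: assumes a constant 50 GB generated per
# minute, the rough guess made in the text, and decimal storage units.
GB_PER_MINUTE = 50
MINUTES_PER_DAY = 24 * 60  # 1,440 minutes
GB_PER_TB = 1_000

# Total volume generated in one day, in GB and in TB.
daily_gb = GB_PER_MINUTE * MINUTES_PER_DAY
daily_tb = daily_gb / GB_PER_TB

# How many "2002-sized" 50 GB stores would be filled in a single day.
stores_filled_per_day = daily_gb // 50

print(f"Generated per day: {daily_gb:,} GB ({daily_tb:.0f} TB)")
print(f"50 GB stores filled per day: {stores_filled_per_day:,}")
```

At that rate, a storage budget that once lasted a company for years would be exhausted 1,440 times a day, which is why file-based storage shared among employees no longer scales.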