Conventional cloud storage services are increasingly expensive, offer few meaningful incentives to their users, and often restrict data transfer. Because they are centralized services, they can also be unreliable when it comes to preserving the integrity of the data.

Massive and Decentralized Storage of Information

One of the most disruptive applications of cryptoasset technology is the massive, decentralized storage of information. Decentralization is a concept that has hovered over many areas of communications, business, and social organization, and Bitcoin's technology presents the world with an option, still in an experimental phase, that combines decentralized and permanent records, transparency, and security with a system of incentives for maintaining the network.

On the other hand, data leaks have been a constant in the history of the internet, so companies and users that handle content they believe should be protected are migrating to cryptoasset networks as an effective and innovative solution. If the information were stored in a single node, there would be a risk of losing it forever if that central repository failed.

Blockchain Networks

Thus, various platforms and implementations dedicated to safeguarding the information of users who lack sufficient storage space have decided to place their trust in these protocols. However, we must remember that blockchain platforms are still projects under development, so it is wise to keep track of them to avoid failures or bad practices that put our data at risk.

In these blockchain networks, the information is protected in a shared way by multiple servers located around the world, each of which keeps a copy of the chain of blocks. Decentralization also allows clients or users to transact with their information, or even edit it, if they hold the private keys unique to that record.

In some ways, you can compare these decentralized networks with the torrent services that are so popular for downloading movies, books, music, and many other files. Working on a P2P logic, a large number of users in the BitTorrent client save a file and keep it online, available to anyone who wants to download it. The data can be duplicated, modified, and distributed endless times.

One of the differences between torrent services and cryptoasset technology is that the former was not designed with a system of monetary incentives; the work of those who participate in it is voluntary.


FileCoin is a cryptocurrency and protocol that works as a solution for data storage. Developed by Protocol Labs, the cryptocurrency runs on top of the InterPlanetary File System (IPFS), seeking to create new ways to store and share information on the internet.

However, its difference from web protocols lies in the fact that, instead of storing files at a centralized URL, its routing algorithm allows content to be obtained from any place or channel that connects to the nodes of its network.

Through a hash address, the content becomes immutable and is protected against the decisions of third parties who may not want that content to exist or be visible to the public. It also allows users to configure privacy levels, from making the entire file public to sharing it selectively with whomever they wish.
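The idea of a hash address can be sketched in a few lines. This is an illustrative simplification, not Filecoin's actual addressing scheme (real IPFS content identifiers layer multihash encoding on top of this principle): the address is derived from the content itself, so any change to the content changes the address and is immediately detectable.

```python
import hashlib

def content_address(data: bytes) -> str:
    # Derive an address from the content itself (content addressing).
    return hashlib.sha256(data).hexdigest()

doc = b"my important file"
addr = content_address(doc)

# The same bytes always map to the same address,
# so the record is stable and verifiable by any node.
assert content_address(b"my important file") == addr

# Any modification yields a different address, so tampering is visible.
assert content_address(b"my important file!") != addr
```

Because the address depends only on the bytes, no central authority is needed to decide where content "lives" or whether a copy is authentic.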

Another advantage of distributing files through this network is that the information is not stored on a single server; it is fragmented across different nodes and users located around the world, independent and separate from one another. In this way, users can rent out their spare storage space to safeguard third-party files and receive a reward for doing so, earning FileCoins for their work.

This operation is common to all the platforms in this list.


Sia is a protocol that emerged from the HackMIT event in 2013, a student meeting where different types of projects are developed and presented. Officially, Sia launched in 2015 and likewise seeks to use spare storage capacity to create a decentralized mass storage market powered by the Siacoin currency.


Storj is a distributed storage project built on the Ethereum network. It is one of the most popular services of this type, with a large, active community of about 20,000 users and 19,000 hosts, which is reflected in its position as a market leader among similar projects for mass distributed storage.


Swarm, in contrast, is not a blockchain protocol or platform but rather a technical implementation of Ethereum for data storage. This tool is to be activated in conjunction with the Whisper messaging service and the Ethereum Virtual Machine (EVM).

It should be noted that it is still an implementation under development, since Ethereum's team of collaborators continues to work on various scalability solutions, so it will arrive progressively at some point.


Maidsafe is a company established in the United Kingdom in charge of implementing the SAFE Network, a decentralized network that uses Proof of Resource as a consensus mechanism to store information.

Given its age, MaidSafe is distinguished from other crypto projects by its much longer history as an enterprise; it was one of the first to propose decentralization as the key to creating the internet of the future.

In theory, each computer randomly queries a node about the information collected and then disseminates it throughout the network, allowing other servers to build a picture of what is happening in real time.

The formulas that turn enormous amounts of data into information with economic value have become the great asset of the multinationals.

Algorithms are sets of programming instructions that, logically embedded in software, make it possible to analyze a previously selected data set and establish an "output," or solution. Companies use these algorithms mainly to detect patterns or trends and, based on them, generate useful data to better adapt their products or services.
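As a minimal sketch of this idea, the snippet below smooths a series of hypothetical weekly sales figures with a moving average and classifies the overall trend. The function names and figures are ours, invented for illustration.

```python
def moving_average(values, window=3):
    # Smooth the series to expose the underlying trend.
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

def trend(values):
    # Compare the smoothed endpoints to classify the direction.
    smoothed = moving_average(values)
    if smoothed[-1] > smoothed[0]:
        return "rising"
    if smoothed[-1] < smoothed[0]:
        return "falling"
    return "flat"

weekly_sales = [120, 135, 128, 140, 152, 149, 160]  # hypothetical data
print(trend(weekly_sales))  # rising
```

A real analytics pipeline would add seasonality handling and statistical tests, but the shape is the same: instructions applied to selected data, producing an output that guides a decision.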

It is nothing new for companies to use advanced analytics to study the characteristics of a product they plan to put on the market, the price at which they want to position it, or even decisions as sensitive as the remuneration policy for their employees. The surprising thing is the scale.

It is not only that the amount of data in circulation has recently multiplied to volumes that are difficult to imagine (it is estimated that humanity has generated 90% of the information in its whole history in the last five years); the possibilities of interconnecting that data have also grown dramatically.

Algorithm revolution

This revolution has been fed by each of the millions of people who give away their data every day, freely and continuously, whether by uploading a photo to Facebook, paying with a credit card, or passing through metro turnstiles with a magnetic card.

In the wake of giants like Facebook and Google, which base their enormous power on the combination of data and algorithms, more and more companies are investing growing amounts of money in everything related to big data. Such is the case of BBVA, whose bet targets both projects invisible to customers, such as the engines that allow more information to be processed to analyze users' needs, and other easily identifiable initiatives, such as the one that enables the bank's customers to forecast the state of their finances at the end of the month.

Dangers and Risks

The vast possibilities offered by algorithms are not without risks. The dangers are many: they range from cybersecurity (dealing with hacking or the theft of the formulas themselves) to user privacy, by way of the possible biases of the machines.

Thus, a recent study by University Carlos III concluded that Facebook uses sensitive data for advertising on 25% of European citizens, who are tagged on the social network according to matters as private as their political ideology, sexual orientation, religion, ethnicity, or health.
Cybersecurity, for its part, has become the primary concern of investors around the world: 41% said they were "apprehensive" about this issue, according to the 2018 Global Investors Survey.

What is the future of the algorithms?

This technology is fully capable of meeting the objectives of almost any organization today, and although we may not realize it, it is present in many well-known firms in the market. Its capabilities for analysis, prediction, and report generation for decision-making make it a powerful strategic tool.

Algorithms, whether through specific applications or with the help of Business Intelligence or Big Data solutions, open the way to taking advantage of the information available in our company and turning it into business opportunities.

Thanks to algorithms, we better understand how our clients and prospects behave, what they need, and what they expect from us. They also allow us to anticipate the actions of our competitors and market trends.

Like any technological innovation that has revolutionized our way of understanding the world throughout human history, it will take us some time to become aware of this new reality and learn to make the most of it. As citizens and as communicators, we can turn algorithms into valuable allies.

The algorithm is at the heart of technologies potentially as powerful as artificial intelligence. Nowadays, algorithms are the basis of machine learning technologies, which surprise us every day with new skills. They are also behind technologies such as virtual assistants and autonomous vehicles.

A programming language is an artificial language designed to express computations that can be carried out by machines such as computers. They can be used to create programs that control the physical and logical behavior of a device, to express algorithms with precision, or as a mode of human communication.

It is formed by a set of symbols and syntactic and semantic rules that define its structure and the meaning of its elements and expressions. The process by which you write, test, debug, compile, and maintain the source code of a computer program is called programming.

The word programming is also defined as the process of creating a computer program through the application of logical procedures, following these steps:

  • The logical development of the program to solve a particular problem.
  • Writing the logic of the program using a specific programming language (program coding).
  • Assembly or compilation of the program until it becomes a machine language.
  • Testing and debugging the program.
  • Development of documentation.

There is a common error that treats the terms 'programming language' and 'computer language' as synonyms. Computer languages encompass programming languages and others besides, such as HTML (a markup language for web pages that is not properly a programming language but a set of instructions for laying out the content and text of documents).

A programming language allows you to specify precisely what data a computer should operate on, how that data should be stored or transmitted, and what actions to take under a variety of circumstances. All of this through a language that tries to stay relatively close to human or natural language, as is the case with the Lexicon language. A relevant characteristic of programming languages is precisely that more than one programmer can use a common set of instructions, understood by all of them, to build a program collaboratively.

The implementation of a language is what provides a way to run a program on a certain combination of software and hardware. There are basically two ways to implement a language: compilation and interpretation. Compilation is the translation into a code that the machine can use. The translators that perform this operation are called compilers. These, like advanced assembler programs, can generate many lines of machine code for each statement of the source program.

Imperative and functional languages

Programming languages are generally divided into two main groups based on how their commands are processed:

  • Imperative languages
  • Functional languages.

Imperative programming language

Through a series of commands, grouped into blocks and composed of conditional orders, an imperative language allows a program to return to a block of commands if certain conditions are met. These were the first programming languages in use, and even today many modern languages use this principle.

However, structured imperative languages lack flexibility due to the sequentiality of their instructions.

Functional programming language

A functional programming language is a language that creates programs by means of functions: each function returns a new result state and receives as input the results of other functions. When a function invokes itself, we speak of recursion.
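A classic illustration of both ideas, functions consuming the results of other functions and a function invoking itself, is the factorial:

```python
def factorial(n: int) -> int:
    if n <= 1:                     # base case: stops the recursion
        return 1
    return n * factorial(n - 1)    # the function invokes itself

print(factorial(5))   # 120

# Functions can also feed each other: the result of one
# becomes the input of another.
print(factorial(factorial(3)))  # factorial(6) = 720
```

The base case is what keeps the recursion finite; without it, the function would call itself forever.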

Programming languages can also, in general, be divided into two categories:

  • Interpreted languages
  • Compiled languages

Interpreted language

A programming language is, by definition, different from machine language. Therefore, it must be translated so that the processor can understand it. A program written in an interpreted language requires an auxiliary program (the interpreter), which converts the program's commands as needed.

Compiled language

A program written in a "compiled" language is translated by a dedicated program called a compiler, which in turn creates a new, independent file that does not need any other program to run. This file is called an executable.

A compiled program thus has the advantage of not needing an auxiliary program once it has been compiled. And since the translation only needs to be done once, execution is faster.

An interpreted language, being directly readable, means that anyone can learn the inner workings of a program and, in this way, copy its code or even modify it.
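Python itself illustrates both strategies at once: source code is first compiled to bytecode, which an interpreter (the Python virtual machine) then executes. A minimal sketch using the built-in `compile` and `eval`:

```python
# Translation step: source text is compiled into a code object (bytecode).
source = "width * height"
code = compile(source, "<expr>", "eval")

# Interpretation step: the virtual machine executes the code object
# against whatever variables we supply.
area = eval(code, {"width": 3, "height": 4})
print(area)  # 12
```

Languages like this are often called "hybrid": the translation happens, but to an intermediate code for a virtual machine rather than directly to machine code for the processor.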




To write programs that deliver the best results, a number of details must be taken into account.

  • Correctness. A program is correct if it does what it should do, as established in the phases prior to its development.
  • Clarity. It is essential that the program be as clear and legible as possible, to facilitate its development and subsequent maintenance. When developing a program, you should try to make its structure coherent and straightforward, and take care with editing style; this eases the programmer's work, both in the creation phase and in the later stages of error correction, extension, and modification. Those stages may even be carried out by another programmer, which makes clarity all the more necessary so that others can continue the work efficiently.
  • Efficiency. The point is that the program manages the resources it uses in the best possible way. Usually, the efficiency of a program refers to the time it takes to perform its task and the amount of memory it needs, but other resources can also be considered, depending on its nature (disk space used, network traffic generated, etc.).
  • Portability. A program is portable when it can run on a platform, whether hardware or software, different from the one on which it was developed. Portability is a very desirable feature, since it allows, for example, a program designed for GNU/Linux systems to also run on the Windows family of operating systems, enabling the program to reach more users more easily.

When data is not managed, it can become overwhelming, making it difficult to obtain the information that is needed at the right time. Fortunately, there are software tools that, although designed to address specific needs such as data storage, discovery, and compliance, share the general objective of making data management and maintenance easy.

What is structured data?

When we talk about structured data, we refer to the information usually found in most databases: text files typically displayed in rows and columns with headers. This data can be easily ordered and processed by any data mining tool. We could picture it as a perfectly organized filing cabinet where everything is identified, labeled, and easily accessible.
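A minimal sketch of why structure matters: once data lives in labeled rows and columns, ordering and filtering become trivial. The records below are invented for illustration.

```python
# A tiny "table": rows with named columns, like a database relation.
rows = [
    {"name": "Ana",   "department": "Sales",   "age": 34},
    {"name": "Luis",  "department": "IT",      "age": 29},
    {"name": "Marta", "department": "Finance", "age": 41},
]

# Explicit structure makes ordering and filtering one-liners.
by_age = sorted(rows, key=lambda r: r["age"])
it_staff = [r["name"] for r in rows if r["department"] == "IT"]

print([r["name"] for r in by_age])  # ['Luis', 'Ana', 'Marta']
print(it_staff)                     # ['Luis']
```

Every tool in the chain, from SQL engines to spreadsheets, relies on exactly this property: the meaning of each value is declared by its column.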

It is likely that most organizations are already familiar with this type of data and are using it effectively, so let's move on to unstructured data.

What is unstructured data?

Although it seems incredible, a company's database of structured information does not contain even half of the information that is available in the company and ready to be used. Some 80% of the information relevant to a business originates in unstructured form, mainly in text format.

Unstructured data is usually binary data with no identifiable internal structure: a massive, disorganized conglomerate of objects that have no value until they are identified and stored in an organized manner.

Once organized, the elements that make up their content can be searched and categorized (at least to some extent) to obtain information.

For example, although most data mining tools are not capable of analyzing the information contained in email messages (however organized they may be), collecting and classifying the data they contain can reveal information relevant to our organization. This example illustrates the importance and scope of unstructured data.

But doesn't e-mail have a structure?

The term "unstructured" meets with differing opinions for various reasons. Some people argue that although a formal structure cannot be identified, one may be implicit, in which case the data should not be categorized as unstructured. Others counter that if data has some form of structure, but that structure is not useful and cannot be used to process the data, it should still be categorized as unstructured.

Although e-mail messages may contain information with some implicit structure, it is reasonable to think of them as unstructured information, since common data mining tools are not prepared to process and analyze them.

Unstructured data types

Unstructured data is raw, unorganized data. Ideally, all this information could be converted into structured data; however, that would be expensive and would require a lot of time. In addition, not all types of unstructured data can easily be converted into a structured model. Continuing with the e-mail example: an e-mail contains structured information such as the time it was sent, the recipient, and the sender. However, the content of the message is not easily divided or categorized, and this can be a compatibility problem for the structure of a relational database system.
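The split can be seen with Python's standard `email` library: the headers map cleanly onto relational columns, while the body remains free text. The message below is invented for illustration.

```python
from email.message import EmailMessage

# Build a sample message.
msg = EmailMessage()
msg["From"] = "ana@example.com"
msg["To"] = "luis@example.com"
msg["Subject"] = "Quarterly figures"
msg.set_content("Hi Luis, here are the numbers we discussed. Thanks!")

# Structured part: fields that fit straight into database columns.
record = {
    "sender": msg["From"],
    "recipient": msg["To"],
    "subject": msg["Subject"],
}

# Unstructured part: raw text a relational schema cannot easily decompose.
body = msg.get_content()

print(record["sender"])   # ana@example.com
print(body.strip())
```

The headers could be loaded into a table today; extracting meaning from the body requires text mining, which is exactly the gap the article describes.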

This is a limited list of unstructured data types:

  • Emails.
  • Word processor files.
  • PDF files.
  • Spreadsheets.
  • Digital images.
  • Video.
  • Audio.
  • Social media posts.

Looking at that list, you might ask what these files have in common. They are files that can be stored and managed without the system having to understand their format. Since the content of these files is not organized, they can be stored in an unstructured way.

Indeed, many qualified voices in the sector suggest that it is unstructured information that offers the greatest insight. In any case, the analysis of data of different types is essential to improving both productivity and decision-making in any company.

The Big Data industry continues to grow, but there remains a problem with unstructured data that is not yet being used. Companies have already identified the problem, however, and technologies and services are being developed to help solve it.

The strategic use of information gives companies a competitive response capacity that requires the search, management, and analysis of large amounts of data from different sources. Among this information, secondary data carries essential weight when it comes to extracting value for use in research or studies.

In contrast to primary information, created expressly for a specific study, the researcher also has secondary data: valid information already developed by other researchers that may be useful for a particular investigation.

Likewise, these data may have been generated previously by the same researchers or, in general, by the same organization that conducts the study or, where appropriate, commissioned it. That is why, as a general recommendation, the search should start with internal data.

Regardless of whether they were obtained inside or outside the organization, the primary data generated in one investigation are considered secondary data in later ones. They can be reused to save time and money when repeating the original collection would not be feasible for obvious budget reasons, or is simply unnecessary because the work has already been done.

Internal and external secondary data

Once the search for internal information has been completed, the researcher should focus on external secondary data sources, ideally following a prior plan that serves as a guide through the large number of sources available today.

Secondary information can therefore be roughly divided into internal and external secondary data:

  • Internal secondary data – information available within the company, from accounting data, letters from customers or suppliers, vendor reports, or surveys from the human resources department to, for example, previous research.
  • External secondary data – data collected by sources external to the company. It can be found in other organizations or companies: census data, institutional statistics, government studies, organizations and associations, and research and data published in periodicals, in books, or on the internet.

The growing importance of secondary information

Secondary data is easier to obtain, relatively inexpensive, and readily available.

Although it is rare for secondary data to provide all the answers to an unusual research problem, such data may be useful for the investigation.

The use of secondary data in research processes has been common practice for years. However, with the emergence of Big Data and easier access to different sources of information, its use has gained strong momentum as a business intelligence tool, mainly for the following reasons:

  • It is easy to access and economical.
  • It serves as a point of comparison between organizational results and the market.
  • It helps focus and define new organizational projects.
  • It allows estimating quantitative benefits (ROI) for new organizational projects.
  • It allows estimating future market behavior based on facts and data.
  • It facilitates strategic decision-making in organizations.
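The ROI estimate mentioned above is simple arithmetic; a quick sketch with hypothetical figures:

```python
def roi(gain, cost):
    # Return on investment, expressed as a fraction of the cost.
    return (gain - cost) / cost

# Hypothetical project: 12,000 of benefit on a 10,000 investment.
print(f"{roi(12000, 10000):.0%}")  # 20%
```

Secondary data typically supplies the inputs to such estimates, for example market-size statistics used to project the expected benefit.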

Among the disadvantages of secondary data is that it may originally have been gathered for purposes different from the current problem, which limits the information we can obtain for our research.

It is likely that the objectives, nature, and methods used to collect the secondary data are not adequate for the present situation. Secondary data may also be inaccurate, out of date, or unreliable. Before using secondary data, it is important to evaluate it against such factors.

As a tool of great value that helps provide a clear competitive advantage, it is essential that organizations devote technological and human resources to establishing processes for the identification, selection, validation (verification of accuracy, coherence, and credibility), processing, and analysis of secondary information.

Database performance monitoring and management tools can be used to mitigate problems and help organizations be more proactive, so that they can avoid performance problems and interruptions.

Even the best-designed database experiences performance degradation. No matter how well the database structures are defined or the SQL code is written, things can and will go wrong. And if performance problems are not corrected quickly, they can hurt a company's profitability.

Performance of a Database

When database performance suffers, business processes within organizations slow down and end users complain. But that is not the worst of it. If the performance of customer-facing systems is bad enough, companies can lose business, as customers tired of waiting for applications to respond will go elsewhere.

Because the performance of database systems and applications can be affected by a variety of factors, tools that can find and correct the causes of database performance problems are vital for organizations that rely on database management systems (DBMSs) to run their mission-critical systems. And in today's database-driven IT world, that applies to most companies.

Types of performance problems you should look for

Many types of database performance problems can make it difficult to locate the cause of any individual problem. It is possible, for example, that the database structures or the application code were flawed from the beginning. Bad database design decisions and incorrectly coded SQL statements can result in poor performance.

It may be that a system was well designed initially, but over time changes caused performance to degrade. More data, more users, or different patterns of data access can slow down even the best database applications. Even DBMS maintenance, or the lack of regular database maintenance, can cause performance to plummet.

The following are three important indicators of possible database performance issues in your IT department:

1. Applications that slow down. The most important indication of potential database performance problems is when things that used to run fast start running more slowly. That includes online transaction processing systems used by employees or customers, as well as batch jobs that process data in bulk for tasks such as payroll processing and end-of-month reports.

Monitoring a processing workload without database performance management tools can be difficult. In that case, database administrators (DBAs) and performance analysts have to resort to other methods to detect problems, in particular complaints from end users about issues such as application screens taking too long to load, or nothing happening for a long time after information is entered into an application.

2. System interruptions. When a system is down, database performance is obviously at its worst. Interruptions can be caused by database problems, such as running out of storage space due to growing data volumes, or by a resource that is unavailable, such as a data set, partition, or package.

3. The need for frequent hardware upgrades. Systems that constantly need server upgrades to larger models with more memory and storage are often candidates for database performance optimization. Optimizing database parameters, tuning SQL statements, and reorganizing database objects can be much less expensive than frequently buying bigger, costly hardware.

On the other hand, sometimes hardware upgrades are needed to solve database performance problems. However, with the proper tools for monitoring and managing databases, it is possible to mitigate upgrade costs by locating the cause of the problem and identifying the appropriate remedies. For example, it may be cost-effective to add more memory or implement faster storage devices to resolve I/O bottlenecks that affect database performance, and doing so will probably be cheaper than replacing an entire server.

Problems that tools can help you manage

When database performance problems arise, their exact cause is unlikely to be immediately evident. A DBA must translate vague complaints about end-user issues into the specific performance-related problems that could be causing them. That can be a difficult and error-prone process, especially without automated tools to guide the DBA.

The ability to collect metrics on database usage and identify specific database problems (how and when they occur) is perhaps the most compelling capability of database performance tools. When faced with a performance complaint, the DBA can use a tool to highlight current and past critical conditions. Instead of having to look for the root cause of the problem manually, the software can quickly examine the database and diagnose possible problems.

Some database performance tools can be used to set performance thresholds that, once exceeded, alert the DBA to a problem or trigger an indicator on the screen. DBAs can also schedule database performance reports to run at regular intervals, in an effort to identify problems that need to be addressed. Advanced tools can both identify such situations and help resolve them.
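As a rough sketch of how such threshold-based alerting works (not modeled on any particular product; the threshold value and names are invented), the snippet below times each query against an in-memory SQLite database and flags any that run too long:

```python
import sqlite3
import time

SLOW_QUERY_THRESHOLD = 0.5  # seconds; hypothetical alert threshold

def timed_query(conn, sql):
    # Run the query, measure it, and raise an alert if it is too slow.
    start = time.perf_counter()
    result = conn.execute(sql).fetchall()
    elapsed = time.perf_counter() - start
    if elapsed > SLOW_QUERY_THRESHOLD:
        print(f"ALERT: query took {elapsed:.2f}s: {sql}")
    return result, elapsed

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, total REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(i, i * 1.5) for i in range(1000)])

rows, elapsed = timed_query(conn, "SELECT COUNT(*) FROM orders")
print(rows[0][0])  # 1000
```

Commercial tools layer much more on top (historical baselines, per-statement metrics, root-cause analysis), but the core loop is the same: measure, compare against a threshold, alert.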

There are many variations of performance issues, and advanced performance management tools require a broad set of functionalities.

The critical capabilities provided by database performance tools include:

  • Performance review and SQL optimization.
  • Analysis of the effectiveness of existing indexes for SQL.
  • Display of storage space and disk defragmentation when necessary.
  • Observation and administration of the use of system resources.
  • Simulation of production in a test environment.
  • Analysis of the root cause of the performance problems of the databases.

The tools that monitor and manage the performance of databases are crucial components of an infrastructure that allows organizations to effectively deliver the service to their customers and end users.

When we talk about measurement, we must understand how knowledge differs from data and information.

In informal conversation, the three terms are often used interchangeably, and this can lead to a loose interpretation of the concept of knowledge. Perhaps the simplest way to differentiate them is to think that data is located in the world and knowledge is located in agents of any type, while information adopts a mediating role between them.

An agent is not necessarily a human being; it could be an animal, a machine, or an organization constituted, in turn, by other agents.


A datum is a discrete set of objective facts about a real event. Within a business context, the concept of data is defined as a transaction record. A datum says nothing about the why of things and, by itself, has little or no relevance or purpose. Modern organizations usually store data using technology.

From a quantitative point of view, companies evaluate data management in terms of cost, speed, and capacity. All organizations need data, and some sectors depend on it: banks, insurance companies, government agencies, and Social Security are obvious examples. In organizations of this type, good data management is essential to their operation, since they handle millions of transactions daily. But in general, for most companies, having a lot of data is not always good.

Organizations often store data indiscriminately. This attitude does not make sense for two reasons. The first is that too much data makes it harder to identify the data that is relevant. The second is that data has no meaning in itself: it describes only a part of what happens in reality, provides no value judgments or interpretations, and is therefore no guide to action. Decision making will be based on data, but data will never say what to do; it says nothing about what is essential and what is not. In spite of everything, data is vital to organizations, since it is the basis for the creation of information.


Like many researchers who have studied the concept of information, we will describe it as a message, usually in the form of a document or some audible or visible communication. Like any message, it has a sender and a receiver. Information can change the way the receiver perceives something and can impact their value judgments and behaviors. It has to inform: information is data that makes a difference. The word "inform" originally meant "to give shape to," and information can shape the person who receives it, making some difference in their outlook or insight. Strictly speaking, then, it is the receiver, not the sender, who decides whether the message received is information, that is, whether it truly informs.

A report full of disconnected tables may be considered information by the one who writes it, but judged as "noise" by the one who receives it. Information moves around organizations through formal and informal networks. Formal networks have a visible and defined infrastructure: cables, e-mail boxes, addresses, and more. The messages these networks carry include e-mail, package delivery services, and transmissions over the Internet. Informal networks are invisible.

They are made to measure. An example of this type of network is when someone sends you a note or a copy of an article with the acronym "FYI" (For Your Information). Unlike data, information has meaning. Not only can it potentially shape the recipient; it is also organized for some purpose. Data becomes information when its creator adds meaning to it.

We transform data into information by adding value in several ways. There are several methods:

• Contextualizing: we know for what purpose the data were generated.

• Categorizing: we know the units of analysis of the main components of the data.

• Calculating: the data may have been analyzed mathematically or statistically.

• Correcting: errors have been removed from the data.

• Condensing: the data has been summarized in a more concise form.

Computers can help us add value and transform data into information, but it is hard for them to help us analyze the context of that information; contextualization remains largely human work.
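A rough sketch of how these value-adding steps might look in code, using hypothetical unit-price records (the product names and figures are invented for illustration):

```python
from statistics import mean

# Hypothetical raw transaction data: (product, unit_price); None marks a bad reading.
raw = [("apple", 1.2), ("apple", None), ("pear", 0.9), ("apple", 1.4), ("pear", 1.1)]

# Correcting: drop records with missing values.
corrected = [(p, v) for p, v in raw if v is not None]

# Categorizing: group the data by its unit of analysis (the product).
by_product = {}
for product, price in corrected:
    by_product.setdefault(product, []).append(price)

# Calculating + condensing: summarize each category with a mean price.
summary = {product: round(mean(prices), 2) for product, prices in by_product.items()}

# Contextualizing: record the purpose for which the figures were produced.
report = {"purpose": "average unit price per product", "figures": summary}
print(report)
```

Only the final `report`, with its stated purpose, carries a message; the `raw` list by itself says nothing.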

A widespread problem is to confuse information (or knowledge) with the technology that supports it. From television to the Internet, it is essential to keep in mind that the medium is not the message. What is exchanged matters more than the means used to exchange it. It is often remarked that owning a telephone does not guarantee brilliant conversations. In short, the fact that we now have access to more information technologies does not mean that our level of information has improved.


Most people have the intuitive feeling that knowledge is something broader, deeper and richer than data and information. We will attempt a first definition of knowledge that allows us to communicate what we mean when we talk about knowledge within organizations. For Davenport and Prusak (1999), knowledge is a mixture of experience, values, information and "know-how" that serves as a framework for the incorporation of new experiences and knowledge, and is useful for action. It originates and is applied in the minds of those who know. In organizations, it is often embedded not only in documents or data warehouses, but also in organizational routines, processes, practices, and norms. What the definition makes immediately clear is that knowledge is not pure: it is a mixture of several elements; it is a flow and at the same time has a formalized structure; it is intuitive, and therefore hard to capture in words or to understand fully in logical terms.

Knowledge exists within people, as part of human complexity and our unpredictability. Although we usually think of assets as definite and concrete, knowledge assets are much harder to pin down. Knowledge can be seen as a flow or as a stock. Knowledge derives from information, just as information derives from data. For information to become knowledge, people must do practically all the work.
This transformation occurs thanks to

• Comparison.

• Consequences.

• Connections.

• Conversation.

These knowledge creation activities take place within and between people. Just as we find data in registers, and information in messages, we can obtain knowledge from individuals, knowledge groups, or even in organizational routines.

Information and data are fundamental concepts in computer science. A datum is nothing more than a symbolic representation of some situation or piece of knowledge, without any semantic sense of its own, describing circumstances and facts without transmitting any message.

Information, on the other hand, is a set of data that has been processed adequately so that it can provide a message that contributes to decision making when solving a problem, and that increases the knowledge of the users who have access to it.

The terms information and data may seem to mean the same thing; however, they do not. The main difference between these concepts is that data are symbols of various kinds, while information is a set of such data that has been treated and organized.

Information and data are two different things, although related to each other.

The differences between both are the following:


Data:

  • They are symbolic representations.
  • By themselves, they have no meaning.
  • They can not transmit a message.
  • They are derived from the description of certain facts.
  • Data is often used as a compact way to store content and transmit it to other devices, unlike information, which tends to be much more extensive.


Information:

  • It is the union of data that has been processed and organized.
  • They have meaning.
  • You can transmit a message.
  • Increase knowledge of a situation.
  • Information is broader than a datum, since it is built by integrating a set of data of different types.
  • Another remarkable feature of information is that it is a message with communicational meaning and a social function, whereas a datum on its own usually says nothing and is difficult for a human being to interpret, lacking utility when isolated from the other data that together create a consistent message.

The main difference centers on the message that information can transmit and that a datum on its own cannot. Many data are needed to create a piece of information. The difference between data and information is therefore quite significant, and these terms should not be confused, especially within the computing field and the area of communications.

For data to qualify as information, it must meet these three requirements:

  • Be useful– What is the use of knowing that “The price of X share will rise by 10% in the next 24 hours” if I want to see the definition of Globalization?
  • Be reliable– What good is a piece of information if we do not know whether it is true, accurate or at least trustworthy? Not every piece of data will be perfectly correct, but at the very least it must be reliable; otherwise we risk making a decision based on wrong information.

  • Be timely– What is the use of knowing that it rains in the United States if I live in Argentina? I am looking to see if it will rain in the afternoon in my country to know if I should go out with an umbrella or not.

What is data?

Data are symbolic representations of some entity: alphabetic letters, points, numbers, drawings, and so on. A datum on its own has no meaning or semantic value, that is, it has no impact. But when correctly processed, data become meaningful information that helps in making decisions. Data can be grouped and associated in a specific context to produce information.

Classification of data

  • Qualitative– Data that indicate qualities such as texture, color, experience, etc.
  • Quantitative– Data that refer to numerical characteristics: numbers, sizes, quantities.
  • Discrete– Quantitative data expressed in whole numbers, typically counts.
  • Continuous– Quantitative data that can take any value within a range, including fractions and decimals.
  • Nominal– Data such as sex, academic career, or qualifications; they can be assigned a number in order to process them statistically.
  • Hierarchized– Data that reflect subjective evaluations and are organized according to achievement or preference.
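These categories can be illustrated with a toy classifier. The mapping from Python types to categories is a deliberate simplification made for illustration; real data classification depends on how a field is measured, not just how it is stored.

```python
def classify(value):
    """Return rough tags for a datum following the classification above."""
    tags = []
    if isinstance(value, bool) or isinstance(value, str):
        tags.append("qualitative")                  # colors, qualities, labels...
    elif isinstance(value, int):
        tags.extend(["quantitative", "discrete"])   # whole-number counts
    elif isinstance(value, float):
        tags.extend(["quantitative", "continuous"]) # measurements with decimals
    return tags

print(classify("blue"))  # a quality
print(classify(3))       # a count
print(classify(36.6))    # a measurement
```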

What is information?

Information is a grouping of data whose organization allows it to convey meaning. It decreases uncertainty and increases knowledge. Information is essential for solving problems because it provides what is necessary to make appropriate decisions.

In an organization, information is one of its most vital resources for lasting over time. For data to become information, it must be processed and organized, fulfilling certain characteristics: some are essential, others merely desirable.

Characteristics of the information

  • Relevance– It must be relevant or important enough to generate and increase knowledge. Poor decision making is often due to collecting too much data, so only the most important data should be gathered and grouped.
  • Accuracy– must have sufficient accuracy, taking into account the purpose for which it is needed.
  • Complete– All the information needed to solve a problem must be available in full.
  • Reliable source– The information will be reliable as long as the source is reliable.
  • Deliver to the right person– The information must be given to whoever is entitled to receive it, only then can it fulfill its true objective.
  • Punctuality– The best information is the one that is communicated at the precise moment when it is needed and will be used.
  • Detail– You must have specific details so that this is effective.
  • Comprehension– If the information is not understood, it cannot be used and will have no value for the recipient.

The process of transformation of data into information and knowledge

There are many stages between receiving raw data and arriving at actual knowledge whose benefits we can enjoy; information is one of those intermediate stages.

The process will vary depending on the sample (type, quantity, and quality of data) and depending on our objectives, but the process is somewhat similar to this:

  • Data – We receive a series of data, which may be few or many, may be useful or not, we still do not know.
  • The data are selected – We examine them one by one to see which ones are actually useful to us. Based on this, we end up with a list of selected data.
  • Pre-process – With that selected data, perhaps only 20% of the original set, we organize it so that it can be entered into some processing system.
  • Processed data – The data are no longer just selected; they are now organized and processed. We are carrying out a purposeful transformation of those data because we are looking for a result.
  • Transformed data – The data are no longer raw; they practically have the form of information, and we can already spot certain things that may catch our attention.
  • Patterns – When we repeatedly obtain precise information and examine it for patterns, that information can be useful, reliable and timely. Still, nobody has the absolute truth; any piece of information may carry some error or deviation, however slight.
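The steps above can be sketched as a small pipeline. The temperature readings, the sentinel value -999, and the plausibility bounds are all hypothetical, chosen only to make each stage visible:

```python
# Raw data: quality still unknown (None and -999 mark bad readings).
raw_data = [21.5, None, 23.0, -999, 22.4, 24.1, None, 23.7]

# Selection: keep only readings that are present and plausible.
selected = [x for x in raw_data if x is not None and -50 <= x <= 50]

# Pre-process: organize the selected data for the processing step.
organized = sorted(selected)

# Processing / transformation: derive figures aimed at a result.
info = {
    "samples": len(organized),
    "min": organized[0],
    "max": organized[-1],
    "mean": round(sum(organized) / len(organized), 1),
}

# Pattern: a simple rule applied to the transformed data.
info["trend"] = "warm" if info["mean"] > 22 else "mild"
print(info)
```

Note how much the sample shrinks between the raw and selected stages; that narrowing is exactly the point of selection.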

Enterprise-level companies work with a large volume of data, which makes their analysis and subsequent decision-making complex. It’s necessary to combine data from diverse sources in order to obtain insights and analyze information about consumers and the market. In this article, we are going to address the four types of data analytics that you can (and should) use in your business.

Descriptive analysis

In a business, this refers to the main metrics within the company: for example, monthly profits and losses, or sales made. This data analysis answers the question, “what’s happening now?” Companies can analyze data on the customers of a specific product, the results of campaigns launched, and other pertinent sales info.

Descriptive analysis allows companies to make immediate decisions with a high level of surety since they’re using concrete and up-to-date data. The information coming from this type of analysis is often displayed in graphs and tables, which allows the managers to have a global vision of the monitored data.
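A minimal sketch of descriptive analysis over hypothetical monthly figures (the amounts are invented), answering "what's happening now?":

```python
from statistics import mean

# Hypothetical monthly sales and costs.
sales = {"Jan": 120, "Feb": 135, "Mar": 150}
costs = {"Jan": 100, "Feb": 110, "Mar": 120}

# Profit per month: the kind of concrete, up-to-date metric managers watch.
profit = {m: sales[m] - costs[m] for m in sales}

summary = {
    "total_sales": sum(sales.values()),
    "avg_profit": mean(profit.values()),
    "best_month": max(profit, key=profit.get),
}
print(summary)
```

In practice these same figures would feed the graphs and tables mentioned above rather than a console printout.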

Predictive analysis

Predictive analysis has to do with the probability of an event occurring in the future, the forecast of quantifiable data, or the estimation of a point in time at which something could happen, all through predictive models.

This type of analysis makes forecasts through probabilities, using predictive techniques that have been honed in the stock and investment markets.
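One of the simplest predictive techniques, a least-squares trend line, can be sketched in a few lines. The quarterly history below is invented and kept perfectly linear for clarity; real predictive models are far more elaborate.

```python
def forecast_next(values):
    """Fit a least-squares line to past values and predict the next point."""
    n = len(values)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(values) / n
    # Slope and intercept of the best-fit line y = slope * x + intercept.
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, values))
             / sum((x - x_mean) ** 2 for x in xs))
    intercept = y_mean - slope * x_mean
    return slope * n + intercept  # extrapolate one period ahead

history = [100.0, 110.0, 120.0, 130.0]  # hypothetical quarterly sales
print(forecast_next(history))
```

With a perfectly linear history the extrapolation continues the trend exactly; noisy data would make the forecast only an estimate, which is the essence of predictive analysis.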

Diagnostic analysis

The next step up in complexity, diagnostic analysis requires that the necessary tools be available so that the analyst can delve deeper into the data and isolate the root cause of a problem.

Diagnostic analysis seeks to explain why something occurs. It relates all the available data to find patterns of behavior that can reveal potential outcomes. It is essential for seeing problems before they happen and for avoiding their repetition in the future.

Prescriptive analysis

Prescriptive analysis seeks to answer the question, “what could happen if we take this measure?” Prescriptive studies raise hypotheses about the possible outcomes of the decisions the company makes. An essential analysis for managers, it helps them evaluate the best strategy to solve a problem.

Analyzing data is essential to respond to the constant challenges of today’s competitive business world. It’s no longer enough to analyze the events after they have occurred — it’s essential to be up to date with what’s happening at each moment. Monitoring systems are necessary tools in the business world of today because they allow us to analyze to the second what is happening in the company, enabling immediate action — and hopefully bypassing severe consequences.

An excellent example of this is a traffic application that helps you choose the best route home, taking into account the distance of each route, the speed at which one can travel on each road and, crucially, the current traffic restrictions.

While different forms of analysis can provide varying amounts of value to a business, they all have their place.

Processing techniques and data analysis

In addition to the nature of the data that we want to analyze, there are other decisive factors when choosing an analysis technique. In particular, the workload or the potentialities of the system to face the challenges posed by the analysis of extensive data: storage capacity, processing, and analytical latency.

Stream processing is another widely used technique within Big Data analytics, along with video, voice, geospatial, natural-language, simulation, predictive-modeling, optimization and data-extraction analytics and, of course, querying and report generation. When making decisions aimed at the highest value for one’s business, there is a wide variety of advanced analytic styles to choose from.

Global warming, terrorism, DoS attacks (carried out on a computer system to prevent the access of its users to their resources), pandemics, earthquakes, viruses — all pose potential risks to your infrastructure. In the 2012 Global Disaster Recovery Index published by Acronis, 6,000 IT officials reported that natural disasters caused only 4% of service interruptions, while incidents in the servers’ installations (electrical problems, fires, and explosions) accounted for 38%. However, human errors, problematic updates, and viruses topped the list with 52%.

The 6 essential elements of a solid disaster recovery plan

Definition of the plan

To make a disaster recovery plan work, it has to involve management, who are responsible for coordinating it and ensuring its effectiveness. Additionally, management must provide the necessary resources for the active development of the plan. To make sure every aspect is handled, all departments of the organization should participate in defining the plan.


Next, the company must prepare a risk analysis, create a list of possible natural disasters or human errors, and classify them according to their probabilities. Once the list is completed, each department should analyze the possible consequences and the impact related to each type of disaster. This will serve as a reference to identify what needs to be included in the plan. A complete plan should consider a total loss of data and long-term events of more than one week.

Once the needs of each department have been defined, they are assigned a priority. This is crucial because no company has infinite resources. The processes and operations are analyzed to determine the maximum amount of time that the organization can survive without them. An order of recovery actions is established according to their degrees of importance.

In this stage, the most practical way to proceed in the event of a disaster is determined. All aspects of the organization are analyzed, including hardware, software, communications, files, databases, installations, etc. Alternatives considered vary depending on the function of the equipment and may include duplication of data centers, equipment and facility rental, storage contracts, and more. Likewise, the associated costs are analyzed.

In a survey of 95 companies conducted by the firm Sepaton in 2012, 41% of respondents reported that their DRP strategy consists of an active-passive data center configuration, i.e., all information backed up in a fully equipped data center, with the critical information replicated at a remote site. 21% of the participants use an active-active configuration, where all the company’s information is kept in two or more data centers. 18% said they still use backup tapes, while the remaining 20% do not have a strategy and are not planning one yet.

For VMware, virtualization represents a considerable advance when applied in the Disaster Recovery Plan (DRP). According to an Acronis survey, the main reasons why virtualization is adopted in a DRP are improved efficiency (24%), flexibility and speed of implementation (20%), and cost reduction (18%).

Essential components

Among the data and documents to be protected are lists, inventories, software and data backups, and any other important lists of materials and documentation. The creation of verification templates helps to simplify this process.

A summary of the plan must be supported by management. This document organizes the procedures, identifies the essential stages, eliminates redundancies and defines the working plan. The person or persons who write the plan should detail each procedure, and take into consideration the maintenance and updating of the plan as the business evolves.

Criteria and test procedures of the plan

Experience indicates that recovery plans must be tested in full at least once a year. The documentation must specify the procedures and the frequency with which the tests are performed. The main reasons for testing the plan are verifying its validity and functionality, determining the compatibility of procedures and facilities, identifying areas that need changes, training employees, and demonstrating the organization’s ability to recover from a disaster.

After the tests, the plan must be updated. As suggested, the original test should be performed during hours that minimize disruption in operations. Once the functionality of the plan is demonstrated, additional tests should be done where all employees have virtual and remote access to these functions in the event of a disaster.

Final approval

After the plan has been tested and corrected, management must approve it. They’ll be in charge of establishing the policies, procedures, and responsibilities in case of contingency, and to update and give the approval to the plan annually. At the same time, it would be advisable to evaluate the contingency plans of external suppliers. Such an undertaking is no small feat, but has the potential to save any company when disaster strikes.