
Data visualization allows us to interpret information in a simple, highly visual way. Its primary objective is to communicate information clearly through graphics, diagrams, or infographics.

Sometimes we are not aware of the importance of data in our daily lives. We tend to think of it as something confined to the professional world when, in fact, simple indicators such as your mobile's battery percentage or your car's fuel-consumption figures are fundamental.

At a professional level, reading data and visualizing it graphically is a priority, because at the end of the day these are the indicators that reveal the trend of our results: whether we are improving, holding steady or, on the contrary, getting worse at the tasks carried out by the different work teams. Ultimately, meeting (or missing) the business objectives that have been set depends directly on them. It is therefore necessary to monitor these data constantly in order to have an up-to-date diagnosis of the company's health.

The best way is to translate the data into a visual, graphic image using one of the best tools available on the market. Most of them work in a similar way: they import the data and offer different ways of viewing and publishing it, all with a level of usability suited to people who are not experts in the field, and adapted so that the results display correctly on the different devices available on the market, including mobile ones.
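
To make this concrete, here is a minimal Python sketch of the basic job these tools automate, turning a small table of data into a chart; it assumes the matplotlib library is installed and uses made-up monthly sales figures:

    # A minimal sketch: turning a small, made-up data set into a bar chart.
    # Assumes matplotlib is installed (pip install matplotlib).
    import matplotlib.pyplot as plt

    months = ["Jan", "Feb", "Mar", "Apr"]
    sales = [120, 135, 128, 150]  # hypothetical sales figures

    plt.bar(months, sales, color="steelblue")
    plt.title("Monthly sales (hypothetical data)")
    plt.xlabel("Month")
    plt.ylabel("Units sold")
    plt.tight_layout()
    plt.savefig("monthly_sales.png")  # or plt.show() for interactive display

The dedicated tools listed below do the same thing without code, adding data connectors, sharing options and interactive dashboards on top.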

Here are some of them, along with their main features:

Data Studio (Google)


The Californian giant plays a leading role in the data visualization market thanks to Google Data Studio, a free and easy-to-use tool. It connects to other Google services such as Google Analytics or AdWords and, for a fee, to third-party sources such as Facebook. It is accessed through the browser, with no need to install additional software.

Tableau


It is a popular Business Intelligence tool for interactive data visualization. It is an ideal option for all audiences, whatever the purpose, since its website offers good tutorials for getting familiar with it. The only investment required is the license that best suits your needs once the trial period ends. It meets every level of demand and is also a great choice for corporate purposes.

Power BI


Microsoft has also designed a set of BI tools, from a data-modeling editor to visualization applications. It requires downloading software for your operating system and has a free version that can be expanded with paid packages. It is intuitive and powerful, but not as simple to use as others on this list, which is why it is aimed mainly at more demanding business purposes.

Datawrapper


Another free tool that offers a wide range of solutions for visualizing imported data, from simple bar graphs to much more complex options.

Infogram


This tool is a favorite among the media and for educational purposes, because elements such as templates, icons, and even images and videos can be added to its graphics to suit the consumer's taste.

QlikView

It has a free version that allows you to analyze data and create dashboards, as well as manipulate and interact with the information. The more advanced features are limited to the paid service, which you can try for free in test mode. It also supports building connections with other intermediate applications, so knowledge of programming languages will let you get much more out of it.

Piktochart


It is a data visualization tool specialized in infographics, with thousands of templates and elements for creating them in a personalized way; the results can be downloaded in different high-resolution formats or shared interactively.

Chartblocks

It is a more modest tool but, depending on your needs, it can be enough, because it allows you to create graphics with great simplicity and then share and display them in high resolution in any format.

Even though they all work in a similar way, the best approach is to choose the one that best meets your requirements. Looking for a tool to build simple graphs is not the same as needing advanced business intelligence functions. That is why this list includes eight options with different levels of sophistication and functionality. You can learn more about each of them on their websites before opting for one.

In the information age, data has become an essential element for any brand that wants to develop a precise, effective strategy and achieve engagement with its target audience.

For this reason, many companies invest heavily in recruiting the best talent in this field. But when it comes to choosing, which is better: a data scientist or a data analyst? And more importantly, do companies know the difference between them?

Although both professions are vital to the marketing world, it is essential to understand the differences between their jobs depending on the approach you want to give a strategy. The truth is that the industry tends to use these job titles interchangeably, which has generated a confusion we want to clear up.

Advent of the data scientist

Companies saw the availability of large volumes of data as a source of competitive advantage and realized that, if they used this data effectively, they would make better decisions and stay ahead of the growth curve. This created the need for a new set of skills that included the ability to extract client/user insights, business acumen, analytical and programming skills, machine learning, data visualization and much more. It led to the emergence of the data scientist.

Data scientists and Data analysts

Data scientist– A data scientist typically has a strong business sense and the ability to communicate data-driven conclusions effectively to business stakeholders. A data scientist will not only work on business problems but will also select the right problems, the ones with the most value to the organization.

A data scientist and an analyst can take Big Data analytics and data warehousing programs to the next level. They help decipher what the data is telling a company and are able to separate relevant data from irrelevant data. Both can take advantage of the company's data warehouse to dig deeper into it. This is why organizations must know the difference between data scientists and data analysts.

Data scientists are a kind of evolution of the analyst role, but they focus on using data to establish global trends around a company's problems in order to solve them and improve business strategy.

Data Analyst– Their job is to find patterns and trends in an organization's historical data. While BI relies heavily on exploring past trends, data science lies in finding predictors and the significance behind those trends. Accordingly, the primary objective of a BI analyst is to evaluate the impact of certain events on a business line or to compare the company's performance with that of other companies in the same market.

The data analyst's primary function is to collect data, study it and give it meaning. The process can vary depending on the organization, but the objective is always the same: to give value and meaning to data that by itself has no use. The result of analyzing, extrapolating and drawing conclusions is a piece of information that is relevant on its own, comparable with other data and usable to educate other industry professionals about its applications.

An analyst usually relies on a single source of data, such as the CRM system, while a data scientist can draw conclusions from different sources of information that may not be connected.

Main differences between the two

  • Usually, a data scientist is expected to pose questions that can help companies solve their problems, while a BI data analyst answers questions raised by the business team.
  • Both roles are expected to write queries, work with engineering teams to obtain the correct data and concentrate on deriving information from the data. However, in most cases a BI data analyst is not expected to build statistical models. A BI data analyst typically works on simpler SQL or similar databases or with other BI tools/packages (see the query sketch after this list).
  • The data scientist role requires strong data visualization skills and the ability to turn data into a business story. A BI data analyst is typically not expected to be an expert in business or advanced data visualization.
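
As a rough illustration of the kind of query a BI data analyst writes day to day, the following Python sketch uses the standard sqlite3 module and a hypothetical orders table to aggregate revenue by region:

    # A rough sketch of a typical BI-style aggregation query.
    # Uses only the standard library; the orders table and its data are hypothetical.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("North", 120.0), ("North", 80.5), ("South", 200.0), ("East", 50.0)],
    )

    # Revenue per region, highest first: the kind of question a business team asks.
    query = "SELECT region, SUM(amount) AS total FROM orders GROUP BY region ORDER BY total DESC"
    for region, total in conn.execute(query):
        print(f"{region}: {total:.2f}")

A data scientist would typically go further, combining several such sources and fitting statistical or machine learning models on top of them.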

Companies must know how to distinguish between these two functions and the areas in which a data scientist and a business analyst can add value.

Information is an essential asset for any organization and the potential of its value lies in the data that, on occasion, must be migrated to improve the performance of a database, update versions, reduce costs or implement security policies.

But what is data migration?

This process consists of transferring data from one system to another and usually takes place at times of transition: the arrival of a new application, a change in the storage mode or medium, or the demands of maintaining the corporate database.

Generally, a data migration occurs during a hardware upgrade or transfer from an existing system to an entirely new one. Some examples are:

  • Update of a database.
  • Migration to or from a hardware platform.
  • Migration to new software.
  • Merging of two parallel systems into one, which is required when one company absorbs another or when two businesses merge.

In no case should the term data migration be confused with others that, although similar, differ essentially in the number or diversity of their data sources and destinations. Consolidation, integration and updating of data are different processes with different purposes.

What is data migration, what does it imply and how can it be carried out?

Data migration is often represented by the initials ETL, which stand for extraction, transformation, and loading. Although an ETL process can be applied with other objectives, when considering what data migration is, its primary tasks are extraction and loading (the transformation does not have to be applied in all cases, only when necessary).
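
As a minimal sketch of the extract-transform-load pattern, the following Python example keeps the three steps separate; it uses only the standard library, and the inline CSV export standing in for the legacy source is invented:

    # A minimal extract-transform-load (ETL) sketch using only the standard library.
    # The source data is an inline, made-up CSV export; in a real migration it would
    # come from the legacy system.
    import csv
    import io
    import sqlite3

    RAW_EXPORT = """name,amount
     alice ,100
    BOB,250.5
    carol,
    """

    def extract(raw):
        """Extraction: read raw rows from the legacy source."""
        return list(csv.DictReader(io.StringIO(raw)))

    def transform(rows):
        """Transformation: applied only if needed, e.g. trimming and normalizing values."""
        for row in rows:
            yield {"name": row["name"].strip().title(),
                   "amount": float(row["amount"] or 0)}

    def load(rows, conn):
        """Loading: insert the cleaned rows into the destination database."""
        conn.execute("CREATE TABLE IF NOT EXISTS customers (name TEXT, amount REAL)")
        conn.executemany("INSERT INTO customers VALUES (:name, :amount)", rows)
        conn.commit()

    conn = sqlite3.connect(":memory:")
    load(transform(extract(RAW_EXPORT)), conn)
    print(conn.execute("SELECT * FROM customers").fetchall())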

There are three main options for carrying out data migration:

  • Combine the systems of the two companies or sources into a new one.
  • Migrate one of the systems to the other.
  • Maintain the integrity of both systems, leaving them intact, but creating a common vision for both: a data warehouse.

The most suitable tool for carrying out a data migration is an extraction, transformation and loading tool, as opposed to less productive options such as manual coding; options that do not apply, such as enterprise application integration (EAI); or options that do not provide everything needed to carry out the process with full guarantees, as is the case with replication.

To carry out a data migration it is necessary to go through the following steps:

1. Planning– from the definition of the strategy and scope of the project to the feasibility analysis.

2. Analysis– considering variables such as the integrity, accuracy or consistency of the data to be migrated, and taking into account the characteristics of the source and destination databases.

3. Application selection– the tool can be developed internally or acquired after evaluating the different alternatives.

4. Testing– application of the test cycles to the applications that will use the database.

5. Migration– includes the extraction, transformation and loading stages.

6. Evaluation– it is about measuring the results and analyzing them, determining the necessary adjustments.

Challenges that all data migration must face

Although data migration can be a simple process, its implementation may encounter challenges that will have to be addressed.

  • Discovering that the source code of the source application is not available and that the manufacturer of that application is no longer on the market.
  • Finding types or formats of source data that have no correspondence at the destination: numbers, dates, sub-records.
  • Encoding problems that affect certain datasets.
  • The existence of optimizations in the data storage format, such as binary-coded decimal storage, non-standard storage of positive/negative numerical values, or storage types with mutually exclusive sub-records within a record.
  • Issues related to the appearance of redundancies and duplications when, while the data migration is being carried out, different users keep using the old or the new system or application.

Dismantling the myths

Those who consider what data migration is may find themselves in a difficult position, susceptible to falling into widespread but unfounded beliefs. Understanding the implications of migration means discarding the myths that have nothing to do with it:

  • Data migration is not a simple process of copying data.
  • Data migration is not carried out in one sitting, it is a complex process that has its phases and requires time.
  • Data migration cannot be solved only from the outside, it is necessary and highly recommended to have the support of the owners of the data.
  • The transformation and validation of data cannot, under any circumstances, occur after loading. It must always be done beforehand, and the result must be subjected to test cycles that demonstrate its suitability for loading at the destination.

Best Practices

  • Give data profiling the importance it deserves.
  • Do not underestimate data mapping.
  • Carry out the profiling tasks at the right time and never after loading.
  • Prefer automatic options to manual ones for data profiling.
  • Take advantage of data migration to improve the quality of data and metadata.
  • Rely on data modeling techniques to optimize integration.
  • Keep in mind the operational facet of the data and try to simplify future user interaction in administration, reporting or update tasks.

A programming language is an artificial language designed to express computations that can be carried out by machines such as computers. They can be used to create programs that control the physical and logical behavior of a device, to express algorithms with precision, or as a mode of human communication.

A programming language is formed by a set of symbols and syntactic and semantic rules that define its structure and the meaning of its elements and expressions. The process by which you write, test, debug, compile and maintain the source code of a computer program is called programming.

The word programming is also defined as the process of creating a computer program through the application of logical procedures, following these steps:

  • The logical development of the program to solve a particular problem.
  • Writing the logic of the program using a specific programming language (program coding).
  • Assembly or compilation of the program into machine language.
  • Testing and debugging the program.
  • Development of documentation.

There is a common error of treating the terms ‘programming language’ and ‘computer language’ as synonyms. Computer languages encompass programming languages and others, such as HTML (a markup language for web pages that is not properly a programming language but a set of instructions for laying out the content and text of documents).

A programming language allows you to specify precisely what data a computer should operate on, how it should be stored or transmitted, and what actions to take under different circumstances. All this through a language that tries to stay relatively close to human or natural language, as is the case with the Lexicon language. A relevant characteristic of programming languages is precisely that more than one programmer can use a common set of instructions understood by all of them, so that the program can be built collaboratively.


Imperative and functional languages

Programming languages are generally divided into two main groups based on how their commands are processed:

  • Imperative languages
  • Functional languages.

Imperative programming language

Through a series of commands, grouped into blocks and composed of conditional statements, an imperative language allows the program to return to a block of commands if the conditions are met. These were the first programming languages in use, and many modern languages still follow this principle.

However, structured imperative languages lack flexibility due to the sequential nature of their instructions.

Functional programming language

A functional programming language is a language that creates programs by means of functions: each function returns a new result state and receives as input the results of other functions. When a function invokes itself, we talk about recursion.
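
A minimal Python sketch of this style, in which the result of one function feeds another and a function invokes itself (recursion):

    # A minimal sketch of the functional style: functions receive the results of other
    # functions as input, and recursion replaces explicit loops.
    def factorial(n):
        """A function that invokes itself: recursion."""
        return 1 if n <= 1 else n * factorial(n - 1)

    def average(values):
        return sum(values) / len(values)

    # Composing functions: the output of map/factorial is the input of average.
    print(average(list(map(factorial, [3, 4, 5]))))  # (6 + 24 + 120) / 3 = 50.0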

Programming languages can, in general, be divided into two categories:

  • Interpreted languages
  • Compiled languages

Interpreted language

A programming language is, by definition, different from machine language. Therefore, it must be translated so that the processor can understand it. A program written in an interpreted language requires an auxiliary program (the interpreter), which translates the program’s commands as needed during execution.

Compiled language

A program written in a “compiled” language is translated by an attached program called a compiler which, in turn, creates a new, independent file that does not need any other program to run. This file is called an executable.

A compiled program therefore has the advantage of not needing an attached program in order to run once it has been compiled. Also, since the translation only needs to be done once, execution is faster.

An interpreted language, being directly readable, means that anyone can learn the manufacturing secrets of a program and thus copy its code or even modify it.

Implementation

The implementation of a language is what provides a way to run a program on a certain combination of software and hardware. There are basically two ways to implement a language: compilation and interpretation. Compilation is the translation into a code that the machine can use. The translators that perform this operation are called compilers. These, like advanced assembly programs, can generate many lines of machine code for each statement of the source program.
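
As a small illustration of this translation step, Python itself compiles source code to bytecode and then interprets it; the standard compile function and dis module make the generated instructions visible:

    # A small illustration of the translation step: Python compiles source code to
    # bytecode, which its interpreter then executes. 'dis' shows the generated
    # instructions, with one source statement producing several low-level operations.
    import dis

    source = "total = 2 + 3\nprint(total)"
    code_object = compile(source, filename="<example>", mode="exec")

    dis.dis(code_object)   # inspect the bytecode produced for each statement
    exec(code_object)      # run the compiled code: prints 5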

Technique

To write programs that provide the best results, a series of details must be taken into account.

  • Correctness. A program is correct if it does what it is supposed to do, as established in the phases prior to its development.
  • Clarity. It is essential that the program be as clear and legible as possible, to facilitate its development and subsequent maintenance. When developing a program, you should try to make its structure coherent and straightforward, and take care of the editing style; in this way the programmer’s work is easier, both during creation and in later stages of error correction, extensions, modifications and so on. These stages may even be carried out by another programmer, which makes clarity all the more necessary so that other programmers can continue the work efficiently.
  • Efficiency. The point is that the program should manage the resources it uses in the best possible way. Usually, the efficiency of a program refers to the time it takes to perform its task and the amount of memory it needs, but other resources can also be considered depending on its nature (disk space used, network traffic generated, etc.).
  • Portability. A program is portable when it can run on a platform, whether hardware or software, different from the one on which it was developed. Portability is a very desirable feature, since it allows, for example, a program designed for GNU/Linux systems to also run on the Windows family of operating systems, enabling it to reach more users more easily.

What is a VLAN?

According to Wikipedia, a VLAN, an acronym for virtual LAN (Virtual Local Area Network), is a method of creating independent logical networks within the same physical network. The IEEE 802.1Q protocol is responsible for tagging frames with the VLAN information.

What does this mean? It is simple: it is about logically dividing a physical network. You will understand it better with the following example:

Imagine a company with several departments that you want to keep independent, that is, they cannot exchange data over the network. The solution would be to use several switches, one per department, or to use a single switch logically divided into smaller switches; that is precisely a VLAN. We now have the different departments separated, but we still need to give them access to services such as the internet, the different servers, and so on.

For this, we have two options:

  • Use a layer 3 or layer 4 switch, that is, one with the ability to “route” the different VLANs to a port.
  • Or use a firewall with VLAN support, that is, one that, on the same physical interface, can work with several VLANs as if it had several physical interfaces.

Types of VLANs

Level 1 VLAN

The level 1 VLAN defines a virtual network according to the switch port used, also known as “port switching.” It is the most common type and is implemented by most switches on the market.

Level 2 VLAN

This type of VLAN defines a virtual network according to the MAC addresses of the equipment. In contrast to the port-based VLAN, it has the advantage that computers can change ports, but all MAC addresses must be assigned one by one.

Level 3 VLAN

When we talk about this type of VLAN it should be noted that there are different types of level 3 VLANs:

  • A network-address-based VLAN connects subnets according to the IP addresses of the computers.
  • A protocol-based VLAN allows creating a virtual network by type of protocol used. It is very useful for grouping all the computers that use the same protocol.

How does a port-based VLAN work?

The IEEE 802.1Q protocol is responsible for tagging (TAG) the frames with the VLAN information. It consists of adding a tag, or TAG, to the frame header that indicates which VLAN the frame belongs to.
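
As a minimal sketch of what that tag looks like on the wire, the following Python example builds the 4-byte 802.1Q field (TPID 0x8100 plus priority, DEI and a 12-bit VLAN ID) and inserts it between the source MAC address and the EtherType; the MAC addresses and the VLAN ID are made up:

    # A minimal sketch of an IEEE 802.1Q tag: a 4-byte field inserted between the
    # source MAC address and the EtherType of an Ethernet frame.
    # MAC addresses and the VLAN ID are made up for illustration.
    import struct

    def dot1q_tag(vlan_id, priority=0, dei=0):
        """Build the 4-byte 802.1Q tag: TPID 0x8100 + 3-bit PCP, 1-bit DEI, 12-bit VID."""
        tci = (priority << 13) | (dei << 12) | (vlan_id & 0x0FFF)
        return struct.pack("!HH", 0x8100, tci)

    dst_mac = bytes.fromhex("ffffffffffff")      # broadcast destination
    src_mac = bytes.fromhex("001122334455")      # made-up source MAC
    ethertype = struct.pack("!H", 0x0800)        # IPv4 payload follows

    tagged_header = dst_mac + src_mac + dot1q_tag(vlan_id=10, priority=5) + ethertype
    print(tagged_header.hex())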

Based on the “tagged” VLANs, we can differentiate between:

  • TAGGED– When the connected device can work directly with VLANs, it will send the information of the VLAN to which each frame belongs. Thanks to this feature, the same port can work with several VLANs simultaneously.

When we configure a port with all the VLANs set to TAGGED, we call it a trunk, and it is used to join network devices in cascade. This system allows the packets of a VLAN to pass from one switch to another until they reach all the equipment in that VLAN. We still need to give the VLANs access to services such as the internet and the different servers.

For this, we have two options:

  • Use a layer 3 or layer 4 switch, that is, one with the ability to “route” the different VLANs to a port.
  • Or use a firewall with VLAN support, that is, one that, on the same physical interface, can work with several VLANs as if it had several physical interfaces, each of which will give a VLAN access to the services.

Choosing one or the other depends on whether the firewall used supports VLANs; if we pass communications through the firewall, we will always have more control over them, as I will explain later.

Advantages of segmenting your network using VLANs

The main benefits of using VLANs are the following:

  • Increased security- By segmenting the network, groups that handle sensitive data are separated from the rest of the network, reducing the possibility of breaches of confidential information.
  • Improved performance- By reducing and controlling the transmission of traffic on the network through its division into broadcast domains, performance is enhanced.
  • Cost reduction- Savings result from less need for expensive network upgrades and more efficient use of existing links and bandwidth.
  • Greater IT staff efficiency- VLANs make it possible to define a new network on top of the physical network and to manage the network logically.

In this way we achieve greater flexibility in administering and changing the network, since the architecture can be changed through the switch parameters, making it possible to:

  • Easily move workstations on the LAN.
  • Easily add workstations to the LAN.
  • Easily change the configuration of the LAN.

Advantages of having a firewall with VLAN support

  • Greater cost savings- We will not have to invest in a switch with “routing capacity”; a layer 2 switch, currently very economical, will be enough.
  • Greater security and control- VLANs are not “routed” from one to another without any control; we can create access rules between the VLANs and inspect all the traffic.
  • Higher network performance- We will be able to prioritize specific VLANs or protocols using QoS (Quality of Service).

A typical example is Voice over IP (VoIP) traffic, since it requires:

  • Guaranteed bandwidth to ensure voice quality
  • Priority of transmission over network traffic types
  • Ability to be routed in congested areas of the network
  • Delay of less than 150 milliseconds (ms) through the network

Therefore, as you have seen, having a firewall with VLAN support brings a series of significant advantages when managing your information systems. Not only will you get performance improvements, but you will also simplify your administration tasks.

When data is not managed, it can become overwhelming, which makes it difficult to obtain the information needed at any given time. Fortunately, we have software tools that, although designed to address specific needs such as data storage, discovery or compliance, share the general objective of making data management and maintenance easy.

What is structured data?


When we talk about structured data, we refer to the information usually found in most databases: text files usually displayed in rows and columns with headers. This data can be easily sorted and processed by any data mining tool. We could see it as a perfectly organized filing cabinet where everything is identified, labeled and easily accessible.
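
A small illustration of that organized filing cabinet: structured data is simply rows and columns with headers, as in this made-up table read with Python's standard csv module:

    # A small illustration of structured data: rows and columns with headers, easy
    # for any tool to sort and process. The records are made up.
    import csv
    import io

    table = """customer_id,city,total_spent
    1001,Madrid,250.75
    1002,Lisbon,99.90
    1003,Madrid,410.00
    """

    rows = list(csv.DictReader(io.StringIO(table)))

    # Because the structure is explicit, sorting and filtering are trivial.
    top_spenders = sorted(rows, key=lambda r: float(r["total_spent"]), reverse=True)
    print([r["customer_id"] for r in top_spenders])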

It is likely that most organizations are familiar with this type of data and are already using it effectively, so let’s move on to see the unstructured data.


What is unstructured data?

Although it may seem incredible, a company's database of structured information does not contain even half of the information available in the company and ready to be used. Eighty percent of the information relevant to a business originates in unstructured form, mainly in text format.

Unstructured data is usually binary data that has no identifiable internal structure. It is a massive and disorganized conglomerate of objects that have no value until they are identified and stored in an organized manner.

Once organized, the elements that make up their content can be searched and categorized (at least to some extent) to obtain information.

For example, although most data mining tools are not capable of analyzing the information contained in email messages (however organized they may be), it is possible that collecting and classifying the data contained in them can show us relevant information for our organization. It is an example that illustrates the importance and scope of unstructured data.


But doesn't e-mail have a structure?

The term unstructured is the subject of differing opinions, for various reasons. Some people say that although a formal structure cannot be identified in such data, the structure may be implicit and, in that case, the data should not be categorized as unstructured. Others argue that if the data has some form of structure but it is not useful and cannot be used to process the data, it should be categorized as unstructured anyway.

Although e-mail messages may contain information with some implicit structure, it is logical to think of them as unstructured information, since common data mining tools are not prepared to process and analyze them.

Unstructured data types

Unstructured data is raw and unorganized. Ideally, all this information could be converted into structured data, but that would be expensive and would require a lot of time. In addition, not all types of unstructured data can easily be converted into a structured model. Following the e-mail example, an e-mail contains information such as the time of sending, the recipient and the sender. However, the content of the message is not easily divided or categorized, and this can be a compatibility problem with the structure of a relational database system.
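
A short Python sketch of that e-mail example: the headers carry an implicit structure that the standard email module can extract, while the body remains free text that ordinary tools cannot categorize on their own (the message is invented):

    # A short sketch of the e-mail example: the headers (sender, recipient, date)
    # have an implicit structure, while the body is free text. The message is invented.
    from email import message_from_string

    raw = """From: alice@example.com
    To: bob@example.com
    Subject: Quarterly numbers
    Date: Mon, 01 Apr 2019 10:00:00 +0000

    Hi Bob, the figures look better than last quarter. Let's discuss on Friday.
    """

    msg = message_from_string(raw)

    # Structured part: easy to store in columns.
    print(msg["From"], msg["To"], msg["Date"])

    # Unstructured part: plain text that needs further processing to yield insight.
    print(msg.get_payload())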

This is a limited list of unstructured data types:

  • Emails.
  • Word processor files.
  • PDF files.
  • Spreadsheets.
  • Digital images.
  • Video.
  • Audio.
  • Publications in social media.


Looking at that list, you might ask what these files have in common. They are files that can be stored and managed without the system having to understand their data format. Since the content of these files is not organized, they can be stored in an unstructured way.

Indeed, many qualified voices in the sector suggest that it is unstructured information that offers the greatest insight. In any case, analyzing data of different types is essential to improve both productivity and decision-making in any company.

The Big Data industry continues to grow, but there is a problem with unstructured data that is not yet being used. However, companies have already identified the problem, and technologies and services are being developed to help solve it.

Stochastic optimization seeks the best decision in a scenario that depends on random events, events that depend on chance: the prices of a product, the duration of a task, the number of people queuing at a checkout, the number of breakdowns in a fleet of trucks, or even the approval of a regulation; in short, anything.

Stochastic?

Stochastic is a particularly feared word. It is often said that jargon is made deliberately obscure by jealous experts who want to keep their secrets: we have legal jargon, economic jargon and, closer to our work, the rumor that the creator of the C++ language made it so complicated in order to tell good programmers from bad ones. The word “stochastic” is not dangerous; it simply means random, dependent on chance. The idea is quite simple, but as an adjective it can complicate any discipline.

Problems of Stochastic Optimization

Stochastic optimization problems are in general much more complicated than those that do not consider chance, mainly because randomness means that we do not have a single scenario to optimize, but a set of possible scenarios. For example, if we want to optimize the design of an energy distribution network, we will be working under uncertainty, in which we do not know the actual demand for energy at the time the system will be used. Instead of the demand data, we would have an estimate, perhaps a finite set of possible demands, each with an associated probability.

With this, we can already sense that the business world is full of stochastic problems. What is usually done to solve them? In scenarios with simple decisions, that is, few decision variables and few states, all the possibilities can be explicitly enumerated using decision trees, which are also very intuitive.
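
A minimal Python sketch of that explicit enumeration, with made-up numbers: a few capacity decisions, a few demand scenarios with probabilities, and the decision chosen by expected profit:

    # A minimal sketch of explicit enumeration for a small stochastic decision:
    # a few decisions, a few demand scenarios with probabilities, and the choice
    # that maximizes expected profit. All numbers are made up.
    decisions = [80, 100, 120]                       # capacity we could build
    scenarios = [(70, 0.3), (100, 0.5), (130, 0.2)]  # (demand, probability)

    PRICE, UNIT_COST = 10.0, 6.0

    def profit(capacity, demand):
        sold = min(capacity, demand)
        return PRICE * sold - UNIT_COST * capacity

    def expected_profit(capacity):
        return sum(p * profit(capacity, d) for d, p in scenarios)

    best = max(decisions, key=expected_profit)
    for c in decisions:
        print(f"capacity {c}: expected profit {expected_profit(c):.1f}")
    print("best decision:", best)

With more random variables and more decisions this brute-force enumeration quickly explodes, which is exactly why the techniques discussed below are needed.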

Stochastic Optimization

Although this discipline is considered to have been born in the 1950s with Dantzig and Beale, historically optimization has largely been restricted to non-stochastic problems, essentially because of the complexity that stochastic problems entail. Tackling real problems is still impossible in many cases, but advances in computing capacity and the development of optimization techniques have made it possible to solve problems that were unthinkable until recently. In addition, sometimes only a stochastic approach can substantially improve the solution, which translates into cost savings, service improvement and increased profits, among other far-from-insignificant factors.

An optimization problem has:

  • a series of variables or decisions that must be taken.
  • a series of restrictions that limit those decisions.
  • and an objective function, a measure of cost or quality of the set of decisions taken.

The data associated with the constraints and the objective function are usually known values, but what if these values are determined by random events? Then we have a stochastic optimization problem.

There are two particularly uncomfortable questions:

What happens to feasibility when the restrictions are random?

A solution is feasible when it satisfies all the constraints, but with random constraints we cannot speak strictly of feasibility, only of the probability that a certain solution is feasible. Thus, in the problem of planning the power distribution network, a constraint could be “the capacity of the distribution network is greater than or equal to the demand,” but if the demand turns out to be very high, it may happen that we cannot satisfy it.
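
A short sketch of that idea in Python: with random demand we can only estimate the probability that a given capacity satisfies the constraint, for example by simulation (the demand distribution and the capacity value are invented):

    # A short sketch of feasibility under randomness: estimate, by simulation, the
    # probability that a chosen network capacity covers a random demand.
    # The demand distribution and the capacity value are invented.
    import random

    random.seed(42)
    CAPACITY = 115.0

    def random_demand():
        return random.gauss(mu=100.0, sigma=15.0)  # hypothetical demand model

    trials = 100_000
    feasible = sum(random_demand() <= CAPACITY for _ in range(trials))
    print(f"Estimated probability of satisfying demand: {feasible / trials:.2%}")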

How to redefine the objective function?

The answer to this question is less obvious. We could redefine the objective function as the expected value of the previous objective function, but it could be more interesting to reduce the risk of the decision by using the worst-case scenario. In general, these types of problems have been addressed using linear programming.

Why are these problems complicated?

The complexity of these problems lies in their size. Think of the seemingly harmless lady who leaves the hairdresser and must also decide whether to go by bus or on foot to run an errand. Each new random variable and each new decision multiplies the possibilities: instead of considering four possible outcomes, we should consider tens or hundreds.

If we keep adding elements of uncertainty that may affect the lady's plans, the number of possible outcomes grows exponentially. Let us return now to the problem of planning the energy distribution network: the demand is random, future energy prices are random, wind-energy production is random, and so are fuel costs. The outcome of any decision in this context will depend on what happens with each of those random events.

Conclusion

Most real problems are stochastic in nature; there are few businesses in which all the data is known in advance, and we cannot keep avoiding them. Stochastic optimization can allow us to tackle problems that until now have been solved by “intuition,” by “common sense,” or because “it has always been done this way,” in a more efficient way, providing solutions that will put us at a clear advantage over our competitors.

The strategic use of information gives companies a competitive response capacity that requires searching for, managing and analyzing a great deal of data from different sources. Among this information, secondary data carries essential weight when it comes to extracting value for use in research or studies.

Alongside primary information, created expressly for a specific study, the researcher also has secondary data: valid information already produced by other researchers that may be useful for a particular piece of research.

Likewise, these data may have been generated previously by the same researchers or, in general, by the same organization that conducts the study or, where appropriate, has commissioned it. That is why, as a general recommendation, the search should start with the internal data.

Regardless of whether they are obtained inside or outside the organization, the primary data generated in one investigation become secondary data for subsequent ones. They can be reused in other studies to save time and money, either because it would not be feasible to collect them again for obvious budget reasons or simply because the work has already been done.

Internal and external secondary data

Once the search for internal information has been completed, the researcher should focus on external secondary data sources, ideally following a plan that serves as a guide through the large number of sources available today.

Therefore, secondary information can be roughly divided into internal and external secondary data:

  • Internal secondary data– information available within the company, from accounting data, letters from customers or suppliers, vendor reports or surveys from the human resources department to, for example, previous research.
  • External secondary data– data collected by sources outside the company. It can be found in other organizations or companies: census data, institutional statistics, government studies, organizations and associations, and research and data published in periodicals, books, on the internet or, for example, in digital media.

The growing importance of secondary information

Secondary data is easier to obtain, relatively inexpensive and readily available.

Although it is rare for secondary data to provide all the answers to a particular research problem, such data can still be useful for the investigation.

The use of secondary data in research processes has been common practice for years. However, with the emergence of Big Data and easier access to different sources of information, its use has gained strong momentum as a business intelligence tool, mainly for the following reasons:

  • It is easy to access and economical information.
  • It serves as a point of comparison of the organizational results with respect to the market.
  • It serves to focus and define new organizational projects.
  • Allows estimation of quantitative benefits for new organizational projects (ROI)
  • It allows estimating future market behavior based on facts and data.
  • It facilitates the strategic decision making of organizations.

Among the disadvantages of secondary data, we find that it may originally have been collected for purposes different from the current problem, which limits the information we can obtain and need for the research.

It is likely that the objectives, nature and methods used to collect the secondary data are not adequate for the present situation. Secondary data may also be inaccurate, outdated or unreliable. Before using secondary data, it is important to evaluate it against such factors.

As a tool of great value that helps provide a clear competitive advantage, it is essential that organizations allocate technological and human resources to establishing processes aimed at the identification, selection, validation (verification of accuracy, coherence and credibility), processing and analysis of secondary information.




The development of technology and the Internet has dramatically increased the volume of data handled by large companies. Consequently, data management models have evolved rapidly, leading to the creation of data governance. Among other functions, data governance manages the data storage function, deciding how, when and what is stored.

The main challenges of data governance are:

  • Lack of human resources
  • Too much time spent cleaning and examining the data
  • Access to data
  • Lack of technological resources

The increase in the volume of data has brought technological challenges, from storage to processing.

What is data storage?

The volume of data that a large company generates grows exponentially day by day. The data storage function seeks to meet the objectives set by data governance and to implement good practices and policies on how, when and what is stored.

What is big data?

Big data is a technology that allows the massive and continuous analysis of data and that relies on cloud storage; this technology also helps solve some of the problems of data management and governance.

From data storage to big data

Here are some benefits of data storage in the cloud for big data:

Accessibility– data in the cloud can be accessed from anywhere and at any time. A company with multiple branches can have its employees discuss projects and share information without having to gather them in one place; employees can work from different locations without losing competitiveness.


Reduction of costs– when a company invests in its own storage servers, it incurs other expenses such as maintenance, security, personnel and IT consultants. If data storage services are contracted in the cloud instead, costs can be reduced, since you only pay for the storage consumed, optimizing the use of resources.


Optimization of space– local servers occupy space that could be allocated to productive areas. Big data helps data governance migrate to the cloud, and the cloud storage service optimizes the use of space. How much space do physical files and servers occupy in a large company?


Maintenance– maintaining the servers of a company that provides cloud services is not the responsibility of the contracting companies, so data management is freed from that burden. Besides saving time and money, this allows you to concentrate on other aspects of data management.


Security– companies that provide cloud storage services are at the forefront of information security technologies, which reduces threats and minimizes risks. Large companies, which are often the target of cybercrime, save resources thanks to the security of cloud storage.

Managing information in the workplace requires a rethinking in many areas and at many levels. On the human side, whether managers or employees, everyone will have to improve the methods they use. It is fundamental to change the vision, establish information management strategies and policies, and review what the market offers in order to reach higher levels of competitiveness. It is essential that companies today allocate part of their investment to keeping their work tools up to date.

Traditional ways of working do not respond adequately to the pace of massive data growth, and as a result everyday business tasks fall behind. The question then arises: why not invest more in state-of-the-art technology and let its benefits keep our business constantly at the leading edge? As a decision maker, it is essential to update processes and policies year after year; enabling IT in a flexible, friendly and high-potential way can turn your company into a success story.

Database performance monitoring and management tools can be used to mitigate problems and help organizations be more proactive, so that they can avoid performance problems and interruptions.

Even the best-designed database experiences performance degradation. No matter how well the database structures are defined or how well the SQL code is written, things can and will go wrong. And if performance problems are not corrected quickly, they can hurt a company's profitability.

Performance of a Database

When database performance suffers, business processes within organizations slow down and end users complain. But that is not the worst of it: if the performance of customer-facing systems is bad enough, companies can lose business, as customers who are tired of waiting for applications to respond will go elsewhere.

Because the performance of database systems and applications can be affected by a variety of factors, tools that can find and correct the causes of database performance problems are vital for organizations that rely on database management systems (DBMS) to run their mission-critical systems. And in today's database-centric IT world, that applies to most companies.

Types of performance problems you should look for


Many types of database performance problems can make it difficult to locate the cause of an individual problem. It is possible, for example, that the database structures or the application code were flawed from the beginning. Bad database design decisions and incorrectly coded SQL statements can result in poor performance.

It may be that a system was well designed initially, but over time changes caused performance to degrade. More data, more users or different data access patterns can slow down even the best database applications. Even the maintenance of a DBMS – or the lack of regular database maintenance – can cause performance to plummet.


The following are three important indicators that could indicate database performance issues in your IT department:

1. Applications that slow down. The most important indication of potential database performance problems is when things that used to run fast start running slower. This includes online transaction processing systems used by employees or customers, and batch jobs that process data in bulk for tasks such as payroll processing and end-of-month reports.

Monitoring a processing workload without database performance management tools can be difficult. In that case, database administrators (DBAs) and performance analysts have to resort to other methods to detect problems, in particular complaints from end users about issues such as application screens taking too long to load or nothing happening for a long time after information is entered into an application.

2. System interruptions. When a system is down, database performance is obviously at its worst. Interruptions can be caused by database problems, such as running out of storage space due to growing data volumes, or by a resource that is not available, such as a data set, partition or package.

3. The need for frequent hardware upgrades. Systems that constantly need to be upgraded to larger servers with more memory and storage are often candidates for database performance optimization. Optimizing database parameters, tuning SQL statements and reorganizing database objects can be much less expensive than frequently upgrading expensive hardware and equipment.

On the other hand, sometimes hardware upgrades are needed to solve database performance problems. However, with the proper tools for monitoring and managing databases, it is possible to mitigate upgrade costs by locating the cause of the problem and identifying the appropriate measures to remedy it. For example, it may be cost-effective to add more memory or implement faster storage devices to resolve I/O bottlenecks that affect database performance, and doing so will probably be cheaper than replacing an entire server.
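
As a small, hedged illustration of the tune-before-you-upgrade point, the following Python sketch uses the standard sqlite3 module to show how adding an index changes a query plan from a full table scan to an index search; the table and column names are made up:

    # A small illustration of tuning before upgrading hardware: the same query goes
    # from a full table scan to an index search once an index exists.
    # Table and column names are made up; sqlite3 is part of the standard library.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
        [(i % 500, float(i)) for i in range(10_000)],
    )

    query = "SELECT SUM(amount) FROM orders WHERE customer_id = ?"

    print("Before indexing:")
    for row in conn.execute("EXPLAIN QUERY PLAN " + query, (42,)):
        print(" ", row)   # expect a full scan of the orders table

    conn.execute("CREATE INDEX idx_orders_customer ON orders(customer_id)")

    print("After indexing:")
    for row in conn.execute("EXPLAIN QUERY PLAN " + query, (42,)):
        print(" ", row)   # expect a search using idx_orders_customer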

Problems that tools can help you manage

When database performance problems arise, their exact cause is unlikely to be immediately evident. A DBA must translate vague end-user complaints into the specific, performance-related issues that may be causing the problems described. This can be a difficult and error-prone process, especially without automated tools to guide the DBA.

The ability to collect metrics on database usage and identify specific database problems – how and when they occur – is perhaps the most compelling capability of database performance tools. When faced with a performance complaint, the DBA can use a tool to highlight current and past critical conditions. Instead of having to look for the root cause of the problem manually, the software can quickly examine the database and diagnose possible problems.

Some database performance tools can be used to set performance thresholds that, once crossed, alert the DBA to a problem or trigger an indicator on the screen. DBAs can also schedule database performance reports to run at regular intervals in an effort to identify problems that need to be addressed. Advanced tools can both identify and help resolve such situations.

There are multiple variations of performance issues, and advanced performance management tools require a set of functionalities.

The critical capabilities provided by database performance tools include:

  • Performance review and SQL optimization.
  • Analysis of the effectiveness of existing indexes for SQL.
  • Display of storage space and disk defragmentation when necessary.
  • Observation and administration of the use of system resources.
  • Simulation of production in a test environment.
  • Analysis of the root cause of the performance problems of the databases.

The tools that monitor and manage the performance of databases are crucial components of an infrastructure that allows organizations to effectively deliver the service to their customers and end users.