Author

Maria Mendez

Browsing

A DNS (Domain Name System) server is a computer, or a group of them, connected to Internet nodes and holding a database that our browsers consult regularly.
DNS servers work like an address book for the Internet: they resolve (translate) domain names into IP addresses.

Not only browsers turn to these servers: so do mail programs when sending a message, mobile applications in order to operate, devices in order to connect, and anything else that needs to find the address behind a domain name. DNS servers also have other functions.

Functions of DNS servers

Resolution of names

Name resolution consists of returning the IP address that corresponds to a domain. Internet sites and services are identified by numeric IP addresses, which are almost impossible for humans to memorize; domain names were created for that reason. When the browser is asked for an address, it queries the nearest DNS server, which returns the IP of the requested site.

For example, when we click the link https://norfipc.com, the request travels to the default DNS server of the connection, which returns the result 31.22.7.120. Only then can the browser request the page from that site. Afterwards, the relationship is cached for a while to speed up subsequent queries.
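A minimal sketch of that forward lookup using Python's standard library (the domain is just the example above; the address returned depends on the resolver your connection uses):

```python
import socket

# Ask the operating system's configured DNS resolver for the address
# behind a domain name (forward resolution).
domain = "norfipc.com"
ip_address = socket.gethostbyname(domain)
print(f"{domain} resolves to {ip_address}")
```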

Reverse address resolution

It is the reverse of the previous mechanism: starting from an IP address, obtain the corresponding hostname.
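A minimal sketch of a reverse lookup in Python; 8.8.8.8 is used here only because it appears later in this article, and the lookup works only if the address has a PTR record:

```python
import socket

# Reverse resolution: start from an IP address and ask for the hostname
# registered in its PTR record.
ip_address = "8.8.8.8"
hostname, aliases, addresses = socket.gethostbyaddr(ip_address)
print(f"{ip_address} points back to {hostname}")
```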

Resolution of mail servers

Given a domain name (for example, gmail.com), obtain the server through which e-mail for that domain should be delivered.

DNS servers store a set of data for each domain, known as DNS records.
The A, AAAA, CNAME, NS and MX records, among others, contain IP addresses, host names, canonical names, the domain's mail servers, and so on.
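As a sketch of how these records can be queried programmatically, the snippet below relies on the third-party dnspython package (an assumption: it is not part of the standard library and must be installed separately):

```python
import dns.resolver  # provided by the "dnspython" package

domain = "gmail.com"

# A records: the IPv4 addresses behind the domain.
for answer in dns.resolver.resolve(domain, "A"):
    print("A ", answer.address)

# MX records: the mail servers that accept e-mail for the domain,
# each with a preference value (lower means higher priority).
for answer in dns.resolver.resolve(domain, "MX"):
    print("MX", answer.preference, answer.exchange)
```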

Main Internet DNS servers

There are thousands of DNS servers located at different Internet nodes. Some are managed by ISPs (Internet service providers), others by large companies, and there are even personal DNS servers. Some of them hold only a small database, and queries about sites not included in it are passed on to other servers higher up in the hierarchy.

There are 13 DNS servers on the Internet known as the root servers. They store the information about the servers for each of the top-level zones and constitute the core of the network. They are identified by the first thirteen letters of the alphabet (A to M), and several of them are replicated and geographically dispersed using a technique known as "anycast" in order to increase performance and resilience.

Delay in name resolution

When we try to access a little-known website that we have never visited before and that sits on a remote server, the request goes to the default DNS server of our connection, which, 80% of the time, belongs to a telephone company.

This DNS server is generally slow and holds little information. It forwards the request to a higher-ranking DNS server, and so on, until the query succeeds. If the response is delayed beyond a certain time, the browser treats it as an error and closes the connection.

Errors and censorship in DNS

In addition to the slowness caused by poor-quality, low-performance DNS servers, other factors conspire against the quality of browsing. One of them is name-resolution errors, which make it look as if sites or Internet services are down when they are not. Another is the use of DNS to censor or block websites, a widespread practice in some countries.

Alternate internet DNS servers

Because of the difficulties explained above, the use of alternative DNS servers has become popular. These are services independent of the access providers, generally free, and they often include filtering of inappropriate or dangerous content, such as malware sites or adult-only content. The main ones offer much shorter response times than the telephone companies, which considerably improves the quality and performance of browsing. The best known of these is Google Public DNS, whose IP address is 8.8.8.8.

How to find out the DNS servers of our connection

  1. Open Start, type CMD, and press Enter to open the Command Prompt (CMD console).
  2. In the black window, type the command NSLOOKUP and press Enter again. The application will return the hostname and IP address of the DNS server configured for the connection.
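The same check can be scripted; a minimal sketch that simply runs nslookup from Python and prints whatever it reports (assuming nslookup is available on the system PATH):

```python
import subprocess

# Run nslookup on a test name; the first lines of its output identify
# the DNS server that answered the query.
result = subprocess.run(
    ["nslookup", "example.com"],
    capture_output=True,
    text=True,
)
print(result.stdout)
```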

Conventional cloud storage services are increasingly expensive, offer no significant incentives to their users, and limit the possibilities for data transfer. Also, because they are centralized services, they can be unreliable when it comes to preserving the integrity of the data.

Massive and Decentralized Storage of Information

One of the most disruptive applications of crypto-asset technology is the massive, decentralized storage of information. Decentralization is a concept that has hovered over various areas of communications, business, and social organization, and Bitcoin's technology presents the world with an option, still in an experimental phase, that combines decentralized and permanent records, transparency, and security with a system of incentives for maintaining the network.

On the other hand, data leaks have been a constant in the history of the Internet, so companies and users that handle content they consider worth protecting are migrating to crypto-asset networks as an effective and innovative solution. If the information were stored in a single node, there would be a risk of losing it forever if that central base failed.

Blockchain Networks

Thus, various platforms and implementations dedicated to safeguarding the information of users who do not have enough storage space have decided to place their trust in these protocols. However, we must remember that blockchain platforms are still projects under development, so it is wise to keep track of them to avoid failures or bad practices that put our data at risk.

In these blockchain networks, the information is protected collectively by multiple servers located around the world, each of which keeps a copy of the chain of blocks. Decentralization also allows clients or users to transact with their information, or even edit it, if they hold the private keys unique to that record.

In a way, these decentralized networks can be compared with the torrent services that are so popular for downloading movies, books, music, and many other files. Working with P2P logic, a large number of BitTorrent users store a file and keep it available online for those who want to download it. The data can be duplicated, modified, and distributed endlessly.

One of the differences between torrent services and crypto-asset technology is that the former were not designed with a system of monetary incentives; the work of those who participate in them is voluntary.

FileCoin

FileCoin is a cryptocurrency and protocol that works as a solution for data storage. Developed by Protocol Labs, it runs on top of the InterPlanetary File System (IPFS) and seeks to create new ways to store and share information on the Internet.

Its difference from web protocols is that, instead of storing files at a centralized URL, its routing algorithm allows content to be obtained from any place or channel that connects to the nodes of its network.

Through a hash address, the content becomes immutable and is protected against the decisions of third parties who may not want it to exist or be visible to the public. It also allows users to configure privacy levels, from making the entire file public to sharing it only with whomever they wish.
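To illustrate the general idea of addressing content by its hash, here is a simplified sketch; it is not the actual IPFS/FileCoin addressing scheme, which uses multihash-encoded content identifiers:

```python
import hashlib

# In content addressing, the "name" of a piece of data is derived from
# the data itself, so any modification produces a different address.
content = b"Hello, decentralized storage!"
print("content address:", hashlib.sha256(content).hexdigest())

# Changing a single byte changes the address completely, which is what
# makes the stored content tamper-evident.
tampered = b"Hello, decentralized storage?"
print("tampered address:", hashlib.sha256(tampered).hexdigest())
```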

Another advantage of distributing files through this network is that the information is not stored on a single server: it is fragmented among different nodes and users located around the world, independent and separate from one another. In this way, users can rent out their spare storage space to safeguard third-party files and receive a reward for doing so, earning FileCoin for their work.

This operation is common to all the platforms in this list.

SIA

Sia is a protocol that emerged from the HackMIT event in 2013, a student gathering where different kinds of projects are developed and presented. Officially launched in 2015, Sia also seeks to use spare capacity on storage drives to create a decentralized mass-storage market powered by the Siacoin currency.

STORJ

Storj is a distributed storage project built on the Ethereum network. It is one of the most popular services of its type, with a large, active community of about 20,000 users and 19,000 hosts, which is reflected in its position as market leader among similar mass distributed-storage projects.

SWARM

Swarm is not a blockchain protocol or platform but rather a technical implementation within Ethereum for data storage. The tool is meant to operate in conjunction with the Whisper messaging service and the Ethereum Virtual Machine (EVM).

It should be noted that it is still an implementation under development, since Ethereum's team of collaborators continues to work through various scalability solutions, so it will arrive progressively at some point.

MAIDSAFE

Maidsafe is a company established in the United Kingdom in charge of implementing the SAFE Network, a decentralized network that uses Proof of Resource as its consensus mechanism for storing information.

MaidSafe is distinguished from other crypto projects by its age: it has spent much more time as a company, and it was one of the first to propose decentralization as the key to creating the Internet of the future.

In theory, each computer queries a random node about the information collected and then disseminates it throughout the network, allowing other servers to build a picture of what is happening in real time.

Cloud computing is increasingly welcomed within IT departments. In fact, according to a survey published in Forbes magazine, by 2020, 83% of enterprise workloads would be in the cloud. That is a considerable figure, especially considering that until recently the term cloud computing was unknown, or people did not know what it referred to. Today, business practice requires not only knowing and using these models but also migrating to them, because of their benefits in cost and performance, among others.

What is Cloud Computing?

Cloud computing is a technology in which the resources of the local computer are dispensed with and the computing and storage capacity available over the Internet, the cloud, is exploited instead. Consequently, only an Internet connection is needed to access resources that the local user does not have.

From a more conceptual perspective, cloud computing amounts to a paradigm shift, since it proposes a landscape in which access to information and technological infrastructure is practically ubiquitous. A manager, for example, can review and modify the progress of a project in real time from a cell phone, from anywhere in the world, without needing the technological infrastructure on site.

The term cloud is used as a metaphor for the Internet because flowcharts usually represent it with that figure. The concept behind cloud computing is attributed to John McCarthy, who in 1961 was the first to suggest that computer time-sharing technology could eventually allow processing power to be sold as a service.

How does cloud computing work?

In principle, the essential element in cloud computing is the cloud itself, i.e., the Internet. To illustrate, imagine a user who decides to work with provider X. Once the terms and conditions are accepted, the user has access to the computing power that the provider offers, whether storage space, high-demand processing power, or even software or a platform.

Despite appearing distant, this cloud is closer than you think. It is very likely that you are in it right now. The top 5 apps in the cloud for consumers currently are:

  1. Facebook
  2. Twitter
  3. YouTube
  4. LinkedIn
  5. Pinterest

An example of a cloud storage service that you have surely used is Google Drive. This is just one of the many features of the Google Suite. Through it, a user can store from 15 GB to 10 TB.

Where is the cloud?


It is clear that all resources in the cloud are tangible, and therefore physical and "real". The cloud services are hosted in the facilities of the provider the user has chosen, such as Google's or Facebook's data centers.

Cloud types

There are 3 main types of cloud:

Private- Private clouds offer computing services over a private internal network, exclusive to certain users and not available to the general public. They are also known as internal or corporate clouds.

Public- Public clouds are computing services offered by external providers over the Internet and are therefore available to everyone.

Hybrid- This type of cloud combines characteristics of both, allowing workloads to move between clouds depending on needs and costs. It is the most flexible solution of all.

Within the clouds described above there is also a series of service categories:

  • Software as a Service (SaaS)
  • Platform as a Service (PaaS)
  • Infrastructure as a Service (IaaS)

Benefits of cloud computing

It is important to keep in mind that although cloud services offer many benefits, those benefits depend on the nature of the company that wants to implement them; for some operations the cloud may not be as convenient for IT departments. The main benefits are:

  • Investment costs- since there is no need to invest in infrastructure, the initial investment is much lower.
  • "Unlimited" resources- the resources that can be contracted in the cloud are practically unlimited; you can always access more storage space, more processing power, or more robust applications.
  • Zero maintenance- since the entire infrastructure is managed by a third party, IT departments can focus on more operational functions instead of heavy maintenance and update processes.
  • Security- in the case of a public cloud, providers usually have the most robust security systems available on the market, which helps prevent cyber attacks.
  • Information security- because the information is hosted on the servers of providers with extensive infrastructure, data backups are constant, so the loss of data is very unlikely.

Conclusions

The cloud is here to stay. Mobility, access, and flexibility are essential for today's managers, so it is necessary to stay at the forefront and create strategic alliances with major suppliers of this type of service. From this point of view, the Google suite is by far the best ally in terms of cost, implementation and, above all, innovation; not for nothing is Google the largest Internet company on the market today.

Grid computing was created to solve specific problems, such as those that require a large number of processing cycles or access to large amounts of data. Finding hardware and software that can provide these capabilities commonly raises cost, security, and availability issues. A grid integrates different types of machines and resources, so a grid network never becomes obsolete and all resources are put to use: if all the PCs in an office are replaced, both the old and the new ones can be incorporated.

This technology also gives companies the benefit of speed, a competitive advantage that shortens the time needed to bring new products and services to production.

Advantages and Disadvantages

Grid computing makes it easier to share, access, and manage information through collaboration and operational flexibility, combining not only different technological resources but also diverse people and skills.

As for security in the grid, it is supported by the "intergrids," where security is the same as that offered by the LAN on which the grid technology runs.

Parallelism can be seen as a problem, since a parallel machine is expensive. But if we have a set of small or medium-sized heterogeneous devices whose aggregate computational power is considerable, we can build distributed systems of very low cost and significant computational power.

Grid computing needs a range of services: the Internet, 24/7 connections, broadband, high-capacity servers, computer security, VPNs, firewalls, encryption, secure communications, security policies, ISO standards, and more. Without all these functions and features, it is not possible to talk about grid computing.

Fault tolerance means that if one of the machines in the grid fails, the system detects it and forwards the task to another device, fulfilling the goal of creating flexible and resilient operational infrastructures.
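A toy sketch of that idea: the scheduler below simply retries a task on the next available worker when one fails. It is illustrative only and not how any particular grid middleware is implemented:

```python
import random

def run_on(worker: str, task: str) -> str:
    """Pretend to run a task on a worker; fail randomly to simulate a crash."""
    if random.random() < 0.4:
        raise RuntimeError(f"{worker} is down")
    return f"result of {task} computed on {worker}"

def submit(task: str, workers: list[str]) -> str:
    """Forward the task to the next worker whenever the current one fails."""
    for worker in workers:
        try:
            return run_on(worker, task)
        except RuntimeError as error:
            print(f"fault detected: {error}; rescheduling...")
    raise RuntimeError("no worker could complete the task")

print(submit("matrix-multiplication", ["node-a", "node-b", "node-c"]))
```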

Applications of Grid Computing

Currently, there are five general applications for Grid Computing:

  • Distributed supercomputing- Applications whose needs cannot be met by a single node. The demand arises at specific moments and consumes many resources.
  • Real-time distributed systems- Applications that generate a high-speed stream of data that must be analyzed and processed in real time.
  • Specific services- Here the focus is not on computing power or storage capacity but on resources that an organization may consider unnecessary to own itself; the grid makes those resources available to it.
  • Data-intensive processing- Applications that make heavy use of storage space. They exceed the storage capacity of a single node, so the data is distributed across the grid. Besides the gain in space, distributing the data across the grid allows it to be accessed in a distributed manner.
  • Virtual collaboration environments- An area associated with the concept of tele-immersion, in which the substantial computational resources of the grid and its distributed nature are used to generate distributed 3D virtual environments.

There are real applications that use mini-grids, focused on research in the physical sciences, medicine, and information processing. There are also applications in the field of road safety: for example, such a system can translate the risk of injuring a pedestrian and the resistance of a vehicle's bumper into data that help design the most appropriate protection solution.

Among the first grid projects was the Information Power Grid (IPG), which allows the integration and management of resources across NASA centers. The worldwide SETI@home project, which searches for intelligent extraterrestrial life, can be considered a precursor of this technology, although the idea of grid computing is far more ambitious: it is not only about sharing CPU cycles to perform complex calculations but about creating a distributed computing infrastructure, interconnecting different networks, defining standards, developing procedures for building applications, and so on.

Computer science is, in short, the study of information (“data”), and how to manipulate it (“algorithms”) to solve problems. Mostly in theory, but sometimes also in practice.

You should know that computer science is not the study of computers, nor does it strictly require the use of computers: data and algorithms can be processed with paper and pencil. Computer science is very similar to mathematics, which is why many people now prefer to call the subject simply "computing."

Computer science is often confused with three fields that are related but not the same.

Three Fields

  • Computer engineering- involves the study of data and algorithms but in the context of computer hardware. How do electrical components communicate? How to design microprocessors? How to implement efficient chips?
  • Software engineering- You can think of this branch as “applied computer science,” where computer scientists create abstract theories, while software engineers write real-world programs that combine theory with algorithms.
  • Information technology- This branch involves the software and hardware created so far. IT professionals help maintain networks and assist when others have problems with their devices or programs.

The disciplines of computer science

If you plan to study computer science, you should know that no two universities in the world have the same curriculum. Universities cannot agree on what "informatics" covers, nor on which disciplines belong to the category of computer science.

  • Bioinformatics- The use of information technology to measure, analyze, and understand the complexity of biology. It involves the analysis of large datasets, molecular models, and data simulators.
  • Theory of computation- The study of algorithms and applied mathematics. It is not just about creating new algorithms or implementing existing ones; it is also about discovering new methods and proving theorems.
  • Computer graphics- Studies how data can be manipulated and transformed into visual representations that a human being can understand. It includes topics such as photorealistic images, dynamic image generation, modeling, and 3D animation.
  • Video game development- The creation of entertainment games for PC, the web, or mobile devices. Graphics engines often involve unique algorithms and data structures optimized for real-time interaction.
  • Networks- The study of distributed computer systems and of how communication can be improved within and between networks.
  • Robotics- Deals with the creation of algorithms that control machines. It includes research to improve interactions between robots and humans, between robots and other robots, and between robots and the environment.
  • Computer security- Deals with the development of algorithms to protect applications and software from intruders, malware, and spam. It includes host, cloud, and network security.

A university degree should teach you at least the following:

  1. How computer systems work at the software and hardware level.
  2. How to write code in different programming languages.
  3. How to apply algorithms and data structures naturally.
  4. Mathematical concepts, such as graph theory or formal logic.
  5. How to design a compiler, an operating system, and a computer.

Problem-solving is the primary skill to be developed by any computer scientist, software engineer, or IT professional. If you are not curious and not attracted to figuring things out, you will not enjoy studying this field.

Also, technology is one of the fastest-growing fields in the world, so this path is not for you if you do not want to stay at the forefront of new technologies, new programming languages, new devices, and so on.

The formulas that turn enormous amounts of data into information with economic value have become the great asset of the multinationals.

An algorithm is a set of programming instructions that, embedded in software, analyzes a previously selected set of data and produces an "output" or solution. Companies use these algorithms mainly to detect patterns or trends and, based on them, to generate useful data for better adapting their products or services.
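A minimal sketch of that idea, detecting an upward or downward trend in a small, invented series of weekly sales by fitting a least-squares slope:

```python
# Fit a least-squares line to a series and use the sign of the slope
# as a crude trend detector. The sales figures are made up.
weekly_sales = [120, 132, 128, 141, 150, 149, 163]

n = len(weekly_sales)
xs = range(n)
mean_x = sum(xs) / n
mean_y = sum(weekly_sales) / n

slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, weekly_sales)) \
        / sum((x - mean_x) ** 2 for x in xs)

print(f"slope = {slope:.2f} units per week")
print("trend:", "upward" if slope > 0 else "downward or flat")
```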

It is nothing new for companies to use advanced analytics to study the characteristics of a product they plan to put on the market, the price at which to sell it, or even decisions as sensitive as the remuneration policy for their employees. What is surprising is the scale.

It is not only that the amount of data in circulation has recently multiplied to volumes that are difficult to imagine (it is estimated that humanity has generated 90% of all the information in its history during the last five years); the possibilities for interconnecting that data have also grown dramatically.

Algorithm revolution

Each of the millions of people who give away their data for free, day after day, has contributed to this revolution, whether by uploading a photo to Facebook, paying with a credit card, or passing through the metro turnstiles with a magnetic card.

In the wake of giants like Facebook and Google, which base their enormous power on the combination of data and algorithms, more and more companies are investing growing amounts of money in everything related to big data. That is the case of BBVA, whose bet targets both projects invisible to customers, such as the engines that process more information to analyze users' needs, and other easily identifiable initiatives, such as the one that lets bank customers forecast the state of their finances at the end of the month.

Dangers and Risks


The vast possibilities offered by algorithms are not without risks. The dangers are many, ranging from cybersecurity (hacking or theft of the formulas) to user privacy, along with the possible biases of the machines.

For example, a recent study by Universidad Carlos III concluded that Facebook uses sensitive data for advertising on 25% of European citizens, who are tagged on the social network according to matters as private as their political ideology, sexual orientation, religion, ethnicity, or health.
Cybersecurity, for its part, has become the primary concern of investors around the world: 41% said they were “apprehensive” about this issue, according to the Global Investors Survey of 2018.

What is the future of the algorithms?

This technology is fully capable of meeting the objectives of almost any organization today and, even if we are not aware of it, is present in many well-known firms in the market. Its capabilities for analysis, prediction, and report generation for decision-making make it a powerful strategic tool.

Algorithms, whether through specific applications or with the help of Business Intelligence or Big Data solutions, open the way to taking advantage of the information available in our company and turning it into business opportunities.

Thanks to algorithms, we know better how our clients and prospects behave, what they need, and what they expect from us. They also allow us to anticipate our competitors' actions and market trends.

As with any technological innovation that has revolutionized our way of understanding the world since humans have been humans, it will take us some time to become aware of this new reality and learn to make the most of it. As citizens and as communicators, we can turn algorithms into valuable allies.

Algorithms are at the heart of technologies as potentially powerful as artificial intelligence. Today they are the basis of machine learning, which surprises us every day with new abilities, and they lie behind technologies such as virtual assistants and autonomous vehicles.

Data visualization allows us to interpret information in a simple, highly visual way. Its primary objective is to communicate information clearly through charts, diagrams, or infographics.

Sometimes we are not aware of the importance of data in our daily lives. We think of it as something belonging to the professional world when, in fact, simple indicators such as your phone's battery percentage or your car's fuel-consumption figures are fundamental.

At a professional level, reading data and visualizing it graphically is a priority, because at the end of the day these are the indicators that let us understand the trend of the results: whether the different work teams are improving, holding steady, or getting worse in the tasks they carry out. Reaching the stated business objectives depends directly on this, so it is necessary to monitor the data constantly in order to have an up-to-date diagnosis of the company's health.

The best approach is to translate the data into a visual, graphic image using one of the best tools available on the market. Most work in a similar way: they import the data and offer different ways of viewing and publishing it, all with a level of usability suited to people who are not experts in the field and adapted so that the results can be viewed on the different device formats on the market, including mobile.
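As a minimal sketch of what these tools automate, the same idea can be reproduced with the matplotlib library (an assumption: it is a third-party package that must be installed separately, and the figures below are invented):

```python
import matplotlib.pyplot as plt

# Invented monthly results, just to show raw data turned into a chart.
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
revenue = [10.2, 11.1, 9.8, 12.4, 13.0, 14.2]

plt.bar(months, revenue, color="steelblue")
plt.title("Monthly revenue (example data)")
plt.ylabel("Revenue (thousands)")
plt.tight_layout()
plt.show()
```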

Here are some of these tools and their main features:

Data Studio (Google)


The Californian giant plays a leading role in the data visualization market thanks to Data Studio, a free, easy-to-use tool. It connects to other Google services such as Google Analytics and AdWords and, for a fee, to third-party sources such as Facebook. It is accessed through the browser, with no need to install additional software.

Tableau


It is a popular Business Intelligence tool for interactive data visualization. It is a good option for all audiences, whatever the purpose, since its website offers good tutorials for getting familiar with it. It only requires an initial investment in the license that best suits your needs once the trial period ends. It meets every level of demand and is also a great choice for corporate purposes.

Power BI


Microsoft has also designed a set of BI tools, from an editor and data modeling to visualization applications. It requires downloading software that matches your operating system and has a free version that can be expanded with paid packages. It is intuitive and powerful, but not as simple to use as others on this list, which is why it is aimed mainly at more demanding business uses.

Datawrapper


Another free tool that offers a wide range of ways to visualize imported data, from simple bar charts to much more complex options.

Infogram


This tool is a favorite, especially for media and educational purposes, because elements such as templates, icons, and even images and videos can be added to its graphics to suit the consumer's taste.

Qlikview

It has a free version that lets you analyze data and create dashboards, as well as manipulate and interact with the information. The special features are limited to its paid service, which you can try in test mode for free. It supports connections to other intermediate applications, so knowledge of programming languages will let you get much more out of it.

Piktochart


It is a data visualization tool specialized in infographics, with thousands of templates and elements for creating them in a personalized way; the results can be downloaded in different high-resolution formats or shared interactively.

Chartblocks

It is a more modest tool but, depending on your needs, it can be enough: it lets you create charts very simply and then share and display them in high resolution in any format.

Even though they all work similarly, the best thing is to choose the one that best meets your requirements; looking for a tool to build simple charts is not the same as requiring advanced business intelligence functions. That is why this list includes eight options with different levels of development and functionality. On each tool's web page you can learn more before opting for one.

In the information age, data has become an essential element for any brand that wants to develop a precise, effective strategy and achieve engagement with its target audience.

To this end, many companies invest a lot of money in recruiting the best talent in this field. But when it comes to choosing, which is better: a data scientist or a data analyst? And more importantly, do companies know what the difference between them is?

Although both professions are vital in the marketing world, it is essential to understand the differences between their jobs depending on the approach you want to give a strategy. The truth is that the industry tends to use the two titles interchangeably, which has generated a confusion we want to clear up.

Advent of the data scientist

Companies saw the availability of large volumes of data as a source of competitive advantage and realized that, if they used this data effectively, they would make better decisions and stay ahead of the growth curve. The need arose for a new set of skills that included the ability to extract client/user insights, business acumen, analytical skills, programming skills, machine learning skills, data visualization, and much more. This led to the emergence of the data scientist.

Data scientists and Data analysts

Data scientist– A data scientist typically has a strong business sense and the ability to communicate data-driven conclusions effectively to business stakeholders. A data scientist will not only work on business problems but will also select the right problems, the ones with the most value to the organization.

Both a data scientist and an analyst can take Big Data analytics and data warehousing programs to the next level. They can help decipher what the data is telling a company, and they are able to separate relevant data from irrelevant data. Both can take advantage of the company's data warehouse to dig deeper into it. Even so, organizations must know the difference between data scientists and data analysts.

Data scientists are, in a sense, an evolution of the analyst role, but they focus on using data to identify global trends in a company's problems in order to solve them and improve business strategy.

Data analyst– The analyst's job is to find patterns and trends in an organization's historical data. While BI relies heavily on exploring past trends, data science lies in finding predictors and the significance behind those trends. The primary objective of a BI analyst is therefore to evaluate the impact of certain events on a line of business or to compare the company's performance with that of other companies in the same market.

The data analyst's primary function is to collect data, study it, and give it meaning. The process can vary depending on the organization, but the objective is always the same: to give value and meaning to data that by itself has no use. The result of analyzing, extrapolating, and drawing conclusions is a piece of information that is relevant in its own right, comparable with other data and usable to educate other industry professionals about its applications.

An analyst usually relies on a single data source, such as the CRM system, while a data scientist can draw conclusions from different sources of information that may not even be connected.

Main differences between the two

  • Usually, a data scientist is expected to pose questions that can help companies solve their problems, while a BI data analyst answers questions posed by the business team.
  • Both roles are expected to write queries, work with engineering teams to obtain the right data, and concentrate on deriving information from that data. However, in most cases a BI data analyst is not expected to build statistical models; a BI data analyst typically works with simpler SQL or similar databases or with other BI tools/packages.
  • The data scientist role requires strong data visualization skills and the ability to turn data into a business story. Typically, a BI data analyst is not expected to be an expert in business matters or advanced data visualization.

Companies must know how to distinguish between these two functions and the areas in which a data scientist and a business analyst can add value.

Information is an essential asset for any organization, and much of its potential value lies in data that, on occasion, must be migrated in order to improve database performance, upgrade versions, reduce costs, or implement security policies.

But what is data migration?

This process consists of transferring data from one system to another and usually takes place at times of transition caused by the arrival of a new application, a change in the storage mode or medium, or the needs imposed by maintaining the corporate database.

Generally, a data migration occurs during a hardware upgrade or transfer from an existing system to an entirely new one. Some examples are:

  • Update of a database.
  • Migration to or from a different hardware platform.
  • Migration to new software.
  • Merging of two parallel systems into one, which is required when one company absorbs another or when two businesses merge.

The term data migration should never be confused with others that, although similar, differ essentially in the number or diversity of data sources and destinations. Data consolidation, integration, and updating are different processes with different purposes.

What is data migration, what does it imply and how can it be carried out?

Data migration is represented by the initials ETL, which stand for extraction, transformation, and loading. Although an ETL process can be applied with other objectives, when considering what data migration is, it is inevitable to refer to its primary tasks: extraction and loading (transformation does not have to be applied in every case, only when necessary).
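A minimal ETL sketch using only the Python standard library: extract rows from a CSV file, transform them, and load them into a SQLite table. The file name, column names, and transformation are hypothetical:

```python
import csv
import sqlite3

# Extract: read raw rows from a hypothetical export of the old system.
with open("legacy_customers.csv", newline="", encoding="utf-8") as source:
    rows = list(csv.DictReader(source))

# Transform: normalize the fields the new system expects.
cleaned = [
    (row["id"], row["name"].strip().title(), row["email"].strip().lower())
    for row in rows
]

# Load: write the transformed rows into the destination database.
destination = sqlite3.connect("new_system.db")
destination.execute(
    "CREATE TABLE IF NOT EXISTS customers (id TEXT PRIMARY KEY, name TEXT, email TEXT)"
)
destination.executemany("INSERT OR REPLACE INTO customers VALUES (?, ?, ?)", cleaned)
destination.commit()
destination.close()
```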

There are three main options for carrying out data migration:

  • Combine the systems of the two companies or sources into a new one.
  • Migrate one of the systems to the other.
  • Maintain the integrity of both systems, leaving them intact, but creating a common vision for both: a data warehouse.

The most suitable tool for carrying out a data migration is an extraction, transformation, and loading (ETL) tool, as opposed to less productive options such as manual coding, inapplicable ones such as enterprise application integration (EAI), or others that do not provide everything needed to carry out the process with full guarantees, as is the case with replication.

To carry out a data migration it is necessary to go through the following steps:

1. Planning– from the definition of the strategy and scope of the project to the feasibility analysis.

2. Analysis– considering variables such as the integrity, accuracy, and consistency of the data to be migrated, and taking into account the characteristics of the source and destination databases.

3. Application selection– the application can be developed in-house or acquired after evaluating the different alternatives.

4. Testing– application of the test cycles to the applications that will use the database.

5. Migration– includes the extraction, transformation and loading stages.

6. Evaluation– it is about measuring the results and analyzing them, determining the necessary adjustments.

Challenges that all data migration must face

Although data migration can be a simple process, its implementation may encounter challenges that will have to be addressed.

  • Discovering that the source code of the source application is not available and that the manufacturer of that application is no longer on the market.
  • Finding source data types or formats that have no equivalent at the destination: numbers, dates, sub-records.
  • Encoding problems that affect certain datasets.
  • The existence of optimizations in the data storage format, such as packed binary-coded decimal storage, non-standard storage of positive/negative numeric values, or storage types in which mutually exclusive sub-records share space within a record (see the sketch after this list).
  • Issues related to the appearance of redundancies and duplications when, while the data migration was being carried out, different users were still using either the old or the new system or application.
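As an illustration of the low-level work such storage formats can require, here is a small sketch that decodes packed binary-coded decimal, in which each byte carries two decimal digits and the final nibble holds the sign. The exact layout is an assumption; real legacy formats vary:

```python
def decode_packed_bcd(raw: bytes) -> int:
    """Decode packed BCD: two decimal digits per byte, last nibble is the sign."""
    digits = []
    for byte in raw:
        digits.append(byte >> 4)      # high nibble
        digits.append(byte & 0x0F)    # low nibble
    sign_nibble = digits.pop()        # the final nibble is a sign, not a digit
    value = int("".join(str(d) for d in digits))
    return -value if sign_nibble == 0x0D else value

# 0x12 0x34 0x5C holds the digits 1,2,3,4,5 with a positive sign nibble (0xC).
print(decode_packed_bcd(bytes([0x12, 0x34, 0x5C])))  # prints 12345
```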

Dismantling the myths

Those who consider what data migration is may find themselves in a difficult position, susceptible to falling for widespread but unfounded beliefs. Understanding the implications of migration means setting aside the myths that have nothing to do with it:

  • Data migration is not a simple process of copying data.
  • Data migration is not carried out in one sitting; it is a complex process with its own phases that requires time.
  • Data migration cannot be solved from the outside alone; it is necessary, and highly recommended, to have the support of the owners of the data.
  • The transformation and validation of data cannot, under any circumstances, take place after loading. They must always be done beforehand, and the result must go through cycles of tests that demonstrate its suitability for loading at the destination.

Better Practices

  • Give data profiling the importance it deserves.
  • Do not underestimate the data mapping.
  • Carry out the profiling tasks at the right time and never after loading.

  • Prefer automatic options to manual ones for data profiling.
  • Take advantage of data migration to improve the quality of data and metadata.
  • Rely on data modeling techniques to optimize integration.
  • Keep in mind the operational facet of the data and try to simplify future user interaction in administrative, reporting, or update tasks.

A programming language is an artificial language designed to express computations that can be carried out by machines such as computers. They can be used to create programs that control the physical and logical behavior of a device, to express algorithms with precision, or as a mode of human communication.

A programming language is formed of a set of symbols and of syntactic and semantic rules that define its structure and the meaning of its elements and expressions. The process of writing, testing, debugging, compiling, and maintaining the source code of a computer program is called programming.

The word programming can also be defined as the process of creating a computer program by applying logical procedures, through the following steps:

  • The logical development of the program to solve a particular problem.
  • Writing the logic of the program using a specific programming language (program coding).
  • Assembly or compilation of the program until it becomes a machine language.
  • Testing and debugging the program.
  • Development of documentation.

A common error is to treat the terms "programming language" and "computer language" as synonyms. Computer languages encompass programming languages as well as others, such as HTML, a markup language for web pages that is not properly a programming language but a set of instructions for laying out the content and text of documents.

A programming language allows you to specify precisely what data a computer should operate on, how it should be stored or transmitted, and what actions to take under a variety of circumstances, all through a language that tries to stay relatively close to human or natural language, as is the case with the Lexicon language. A relevant characteristic of programming languages is that more than one programmer can use a common set of instructions, understood by all of them, to build a program collaboratively.


Imperative and functional languages

Programming languages are generally divided into two main groups based on how their commands are processed:

  • Imperative languages
  • Functional languages.

Imperative programming language

Through a series of commands, grouped into blocks and composed of conditional statements, an imperative language allows the program to return to a block of commands when the conditions are met. These were the first programming languages in use, and even today many modern languages follow this principle.

However, structured imperative languages lack some flexibility because of the sequential nature of their instructions.

Functional programming language

A functional programming language is a language that builds programs out of functions, each of which returns a new result state and receives as input the results of other functions. When a function invokes itself, we speak of recursion.
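A minimal sketch of that style and of recursion, written in Python only as neutral pseudocode (classic functional languages include Haskell and Lisp):

```python
def factorial(n: int) -> int:
    """A function defined in terms of itself: the classic example of recursion."""
    if n <= 1:          # base case stops the chain of self-invocations
        return 1
    return n * factorial(n - 1)

# Each call receives as input the result of another call to the same function.
print(factorial(5))  # 120
```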

Programming languages can, in general, also be divided into two categories:

  • Interpreted languages
  • Compiled languages

Interpreted language

A programming language is, by definition, different from machine language; therefore, it must be translated so that the processor can understand it. A program written in an interpreted language requires an auxiliary program (the interpreter), which translates the program's commands as needed during execution.

Compiled language

A program written in a "compiled" language is translated by a separate program called a compiler which, in turn, creates a new, independent file that does not need any other program in order to run. This file is called an executable.

A compiled program therefore has the advantage of not needing an auxiliary program in order to run once it has been compiled. Also, since the translation only needs to be done once, execution is faster.
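Python itself illustrates the difference on a small scale: its built-in compile() translates source text once into a code object, which can then be executed repeatedly without re-reading the text. This is a sketch of the general idea, not of how C or Java toolchains work:

```python
# Translate the source text once...
source = "3 * x + 2"
compiled_expression = compile(source, "<expression>", "eval")

# ...then execute the already-translated form as many times as needed.
for x in range(3):
    print(eval(compiled_expression, {"x": x}))
```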

An interpreted language, being directly readable, means that anyone can learn the inner workings of a program and, in this way, copy its code or even modify it.

Implementation

The implementation of a language is what provides a way to run a program on a certain combination of software and hardware. There are basically two ways to implement a language: compilation and interpretation. Compilation is translation into code that the machine can use; the translators that perform this operation are called compilers. These, like advanced assembler programs, can generate many lines of machine code for each statement of the source program.

Technique

To write programs that provide the best results, a series of details must be taken into account.

  • Correctness. A program is correct if it does what it is supposed to do, as established in the phases prior to its development.
  • Clarity. It is essential that the program be as clear and legible as possible, to facilitate its development and later maintenance. When developing a program, you should try to make its structure coherent and straightforward and take care with editing style; this eases the programmer's work both in the creation phase and in later steps such as error correction, extensions, and modifications, stages that may even be carried out by another programmer, which makes clarity all the more necessary so that others can continue the work efficiently.
  • Efficiency. The point is that the program should manage the resources it uses in the best possible way. Usually, the efficiency of a program refers to the time it takes to perform its task and the amount of memory it needs, but other resources can also be considered, depending on its nature (disk space used, network traffic generated, etc.).
  • Portability. A program is portable when it can run on a platform, whether hardware or software, different from the one on which it was developed. Portability is very desirable, since it allows, for example, a program designed for GNU/Linux systems to also run on the Windows family of operating systems, enabling it to reach more users more easily.