Thursday 28 September 2017

DATA DIGITALIZATION

Definition of Digitalization
Digitalization is the process of converting information into a digital (i.e. computer-readable) format, in which the information is organized into bits.
In modern practice, the digitized data is in the form of binary numbers, which facilitates computer processing and other operations, but strictly speaking, digitizing simply means converting analog source material into a numerical format; decimal or any other number system could be used instead.

PROCESS OF DIGITIZATION
There are two ways in which data can be converted into machine language: faithfully or approximately.
Whatever you wish to represent in a computer, you need to find a way of converting it into numbers. This conversion is sometimes completely faithful, meaning you can recover the original object precisely from the numbers, or it can be an approximation. In the latter case, the digital representation of the original object is incomplete in some ways, and the trick is to make it close enough in the areas that matter, meaning close enough that under ordinary circumstances we can hardly tell the difference, if at all.


Text

Text files are a simple example of an object that can generally be represented faithfully. A text file is just a sequence of letters in some language and other characters (spaces, punctuation marks, maybe a few special characters). The first order of business is to agree, once and for all, on a numerical representation for those characters - what number we use to represent 'A', what represents 'j', what is the number for space ' ' and so on.

One of the most common such schemes is called ASCII. This is just a simple table that assigns numbers to a set of useful characters, including the English alphabet, the digits 0-9, symbols like '@' or '=' and so on. ASCII proper only defines the values between 0 and 127; extended variants fill in the remaining values up to 255, but most text files actually use a good deal fewer than 256 different symbols.
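To see the mapping in action, here is a tiny Python sketch (just an illustration, not part of any particular file format): ord() looks up a character's code, and chr() goes the other way.

# Looking up a few ASCII codes and converting them back to characters.
print(ord('A'), ord('j'), ord(' '))   # prints: 65 106 32
print(chr(65), chr(106), chr(32))     # prints: A j (and a space)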

For example, in ASCII, 'A' is 65, 'B' is 66, 'C' is 67 and so on. The lower case letters start at 'a' (97) and end at 'z' (122). The digits 0-9 span the numbers 48 through 57. Space is 32. A line break
like
this
is actually represented by two symbols in ASCII, one called "line feed" or LF (10) and one called "carriage return" CR (13). This is a carry-over from old typewriter systems and is a well-known nuisance when dealing with text files; some systems insist on having a CR/LF combo at any line break, some don't, and hilarity ensues.
Anyway, if you have a text file and you wish to encode it as binary data, you first scan it from beginning to end, converting each character to its ASCII code. Now you have a sequence of numbers; each such number takes no more than 3 decimal digits to write down (like 122), and if you write it in base 2 instead of base 10 (which is what "binary" means) you need at most 8 digits (called "bits"). Thus every character in a text file requires 8 bits. Computer people like uniformity, so all the numbers are represented using all 8 bits, even those which could be written with fewer. For example, CR is 13, which in binary is 1101 (eight plus four, skip the twos column, plus one), but when storing a text file we would store this character as 00001101. This is just as if we had used 013 instead of 13 in decimal. The advantage is that you don't need any sort of separator between numbers: every 8 bits is one number, and then comes the next one.
A short piece of text like 'Quora' becomes the sequence 81, 117, 111, 114, 97 which in binary is 0101000101110101011011110111001001100001. So here's a binary encoding of a tiny text file. 
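If you want to check that bit string yourself, here is a minimal Python sketch of the encoding just described (the variable names are my own, chosen only for illustration):

# Convert a short piece of text to its ASCII codes, then to 8-bit binary.
text = "Quora"
codes = [ord(ch) for ch in text]                   # [81, 117, 111, 114, 97]
bits = "".join(format(c, "08b") for c in codes)    # each code padded to 8 bits
print(codes)
print(bits)    # 0101000101110101011011110111001001100001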
Of course, once you enlarge the scope of "text files" to cover things with a higher variety of characters, letter sizes, tables and stuff, you'll need more elaborate representation schemes. Let's stop here for now and move on to more exciting objects.


Images

Images begin as physical objects in our physical world: patterns of color and light hitting our retinae. The first order of business is to capture those patterns somehow, which is what cameras do. Older, "analog" cameras capture the light and imprint it on various kinds of film; newer, "digital" cameras employ A/D converters in the body of the camera to transform the real-life color signal into numbers. 
The way this happens is, roughly, as follows. Imagine your field of view is divided into a fine grid of little squares.

Every tiny square on the grid has a color which is more or less uniform across the entire square. The tinier the squares, the more accurate this is. If the squares are large, you may see a shift from dark to light or from red to lighter red inside of a square, so if anyone asks you "what is the color in that square" you'd be hard pressed to give a definite answer. But if the grid is very very fine, most of the time a square will be close enough to having just one single color; in fact, if you replace the real image with one where each square has precisely that one color, a person won't be able to tell the difference. 
This apple isn't really an apple: It's just an array of 256 rows and 256 columns of little squares, and each square has a specific, uniform color. Can you see the little squares? Not really, but if we used a much coarser grid, we would have gotten something like this:
This looks a lot less like an apple and a lot more like an array of squares. We call those squares "pixels", for "picture elements".
Ok. So now we have lots of pixels and each pixel has a color. We need to represent each pixel as a number (or a few numbers), and then we can store those numbers as bits just as we did before.
There are various ways of doing that. A common way uses a color scheme relying on Red, Green and Blue, and measures how much of each is in each square (this is done with color filters, after which the intensity of the light is captured with a sensor). Each color is measured on a scale of 0 to 255, say (which is 8 bits), so you get 24 bits in all for each pixel. Once you've done this, you have an array of 24-bit numbers instead of an array of squares. You arrange those numbers in sequence, add some extra information describing how the file is structured (for instance, how many rows and how many columns it has), and that's it. 
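As a rough sketch of that packing (the colour values below are made up for illustration, not taken from the apple image), here is how one pixel's three measurements can be combined into a single 24-bit number in Python:

# Pack one pixel's Red, Green and Blue values (each 0-255, i.e. 8 bits each)
# into a single 24-bit number, then unpack them again.
r, g, b = 200, 30, 25
pixel = (r << 16) | (g << 8) | b                  # red, then green, then blue
print(format(pixel, "024b"))                      # 110010000001111000011001
print((pixel >> 16) & 255, (pixel >> 8) & 255, pixel & 255)   # 200 30 25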
The process of converting the original image into numbers can be seen as a sequence of "sampling" or "making something discrete". ("Discrete" means that it has a definite number of possible values, instead of a continuum of infinitely many). We divided the image both horizontally and vertically into strips and pixels, and then we divided "color space" into a finite number of possible values. This process of sampling is what lies behind most analog-to-digital conversion schemes. 
In practice, most image file formats employ an additional step called compression. The reason is this: the relatively small apple image we started with has 256 x 256 = 65,536 pixels. Each such pixel needs 24 bits, so just this apple would require 1,572,864 bits. That's quite a lot, if you think about the number of photos you have on your computer or Facebook account. It therefore behooves us to find ways of using fewer bits per image, and this is achieved via compression. JPEG, GIF and PNG files utilize various such compression schemes. That's a whole other can of worms which we should save for a separate answer.
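As a quick back-of-the-envelope check of that figure (assuming the 256 x 256 grid and 24 bits per pixel described above):

# Size of the uncompressed apple image: rows x columns x bits per pixel.
rows, cols, bits_per_pixel = 256, 256, 24
total_bits = rows * cols * bits_per_pixel
print(total_bits)          # 1572864 bits
print(total_bits // 8)     # 196608 bytes, roughly 192 KB, before any compression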

HISTORY OF COMPUTER DEVELOPMENT
Strictly speaking, electronic data processing does not go back very far: such machines have existed only since the early 1940s. In the early days, when our ancestors lived in caves, counting was a problem, and it only kept getting harder. When they started using stones to count their animals or possessions, they never knew that this would one day lead to the computer of today. People began following a set procedure to perform calculations with these stones, which later led to the creation of a digital counting device; this predecessor of later machines was the first calculating device ever invented, known as the ABACUS.
THE ABACUS
The abacus is known to be the first mechanical calculating device, used to perform addition and subtraction easily and speedily. This device was first developed by the Egyptians in the 10th century B.C., but it was given its final shape in the 12th century A.D. by Chinese educationists. The abacus is made up of a wooden frame in which rods are fitted across, with round beads sliding on the rods. It is divided into two parts called 'Heaven' and 'Earth': Heaven is the upper part and Earth is the lower one. Thus any number can be represented by placing the beads in the proper places.

NAPIER’S BONES
As necessity demanded, scientists started inventing better calculating devices. In this process, John Napier of Scotland invented a calculating device in the year 1617 called Napier's Bones. In this device, Napier used rods of bone for counting, with numbers printed on the rods. With these rods one can do addition, subtraction, multiplication and division easily.

PASCAL’S CALCULATOR
In the year 1642, Blaise Pascal, a French scientist, invented an adding machine called Pascal's calculator, which represented the digits of a number by the positions of gears inside it.
LEIBNIZ CALCULATOR
In the year 1671, a German mathematician, Gottfried Leibniz, modified Pascal's calculator and developed a machine which could also perform calculations based on multiplication and division.

ANALYTICAL ENGINE
In the year 1833, a scientist from England known as Charles Babbage designed a machine that could store data safely. This device was called the Analytical Engine, and it is deemed the first mechanical computer. It included features that are still used in today's computers. For this great invention, Charles Babbage is also known as the father of the computer.
TABULATING MACHINE
The tabulating machine was an electromechanical machine designed to assist in summarizing information stored on punched cards. Invented by Herman Hollerith, the machine was developed to help process data for the 1890 U.S. Census. Later models were widely used for business applications such as accounting and inventory control. It spawned a class of machines, known as unit record equipment, and the data processing industry.

ENIAC (ELECTRONIC NUMERICAL INTEGRATOR AND COMPUTER)
ENIAC's design and construction was financed by the United States Army, Ordnance Corps, Research and Development Command, led by Major General Gladeon M. Barnes. The total cost was about $487,000, equivalent to $6,740,000 in 2016. ENIAC was designed by John Mauchly and J. Presper Eckert of the University of Pennsylvania, U.S.
ENIAC was a modular computer, composed of individual panels to perform different functions. Twenty of these modules were accumulators which could not only add and subtract, but hold a ten-digit decimal number in memory. Numbers were passed between these units across several general-purpose buses (or trays, as they were called). In order to achieve its high speed, the panels had to send and receive numbers, compute, save the answer and trigger the next operation, all without any moving parts. Key to its versatility was the ability to branch; it could trigger different operations, depending on the sign of a computed result.

