Sunday, March 13, 2011

Part 3: How Software Works

Hardware requires software to control it. Sheet music and punch cards for looms are examples of software (meaningful instructions that control those pieces of hardware). Some of the earliest contemporary software came from a model railroad club at MIT that used telephone switches to control the track. At the beginning of the computer era, programs were either very expensive (as with custom applications for mainframes or minicomputers) or free (as with freeware distributed by programmers, often with an academic bent). Personal computers were not much more than a novelty until a killer app hit the scene.
VisiCalc for the Apple II was the first killer app because it brought an easy-to-understand spreadsheet application to a mass-market platform and could show a return on investment for many consumers. VisiCalc allowed businesses and individuals to crunch their own numbers and get more timely, accurate information into the hands of decision makers. The popularity of the best software was due to how much it empowered non-technical users to harness the computing power of their PCs.
All programs must have an operating system on which to run, and the market dominance of Windows and Mac OS has led to a flourishing of programs available to users.


Chapter 7: How Programming Languages Work

Various computer languages are described as high level or low level. The more a language resembles natural human language, the higher its level. Lower level languages are more difficult to work with, but usually produce programs that are smaller and faster. The lowest level is machine language, which is a set of codes that communicate with the microprocessor’s internal instructions. Slightly higher than machine language is assembly, which uses simple command words to supply step-by-step instructions for the processor to carry out. Assembly directly manipulates the values in the computer’s registers and thus creates compact, fast code. Higher level languages include C and Java, which allow for instructions that more closely parallel English. Unlike programs in most languages, which are built for a particular operating system, compiled Java code will run on any computer that has a Java Virtual Machine, regardless of operating system.
Generally, complex software consists of a master program file (or kernel) and a collection of subprograms (routines). The kernel calls the routines it needs to perform tasks. Together they provide a means to input and output data. Information is typically stored in variables and can be parsed as needed. Branching is used in combination with Boolean logic functions to create a logical map of the paths the program can take.
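To make the master program and routine idea concrete, here is a minimal sketch in Python. The routine names (`read_input`, `classify`, `main`) and the weather example are made up for illustration; they are not from the book.

```python
def read_input(raw: str) -> float:
    """Routine: parse raw input text into a number held in a variable."""
    return float(raw)


def classify(temperature: float, raining: bool) -> str:
    """Routine: branch on Boolean conditions to pick one path."""
    if temperature < 0 and not raining:
        return "cold and dry"
    elif temperature < 0 and raining:
        return "freezing rain"
    elif raining:
        return "wet"
    return "mild"


def main(raw: str, raining: bool) -> str:
    """Master program: calls the routines it needs, in order."""
    temperature = read_input(raw)           # input stored in a variable
    label = classify(temperature, raining)  # branching with Boolean logic
    return f"{temperature:.1f} degrees: {label}"  # output
```

Calling `main("-3.5", False)` follows the first branch, while `main("12", True)` falls through to the "wet" branch.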
Interpreted programs (DOS batch files, BASIC, Java, and macros) require an interpreter to look up the meaning of each command in the program. If no instruction set matching a command is found, an error message is generated. When the interpreter executes a command, the command and its parameters are turned into code tokens. The interpreter keeps track of its position in the program by recording an offset, or the number of lines it is from the beginning of the file.
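The loop described above can be sketched as a toy interpreter. The three-command language (`SET`, `ADD`, `PRINT`) is invented for this example and does not correspond to any real interpreter.

```python
def run(source: str) -> list:
    """Interpret a tiny made-up language one line at a time."""
    variables = {}
    output = []

    def do_set(name, value):
        variables[name] = int(value)

    def do_add(name, value):
        variables[name] += int(value)

    def do_print(name):
        output.append(f"{name}={variables[name]}")

    # The interpreter's lookup table: command word -> instructions to run.
    commands = {"SET": do_set, "ADD": do_add, "PRINT": do_print}

    lines = source.splitlines()
    offset = 0  # number of lines from the beginning of the program
    while offset < len(lines):
        tokens = lines[offset].split()  # the command and its parameters
        if tokens:
            keyword, params = tokens[0].upper(), tokens[1:]
            handler = commands.get(keyword)
            if handler is None:
                # No matching instruction set: report an error.
                raise SyntaxError(f"line {offset}: unknown command {keyword!r}")
            handler(*params)
        offset += 1
    return output
```

Running `run("SET x 2\nADD x 3\nPRINT x")` walks the three lines in order, while a program containing an unknown word such as `HALT` stops with an error that names the offset.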
While an interpreter translates the source code line by line each time a program is run, a compiler translates the entire source code into an executable file that a specific type of computer can run without an interpreter. Compiling begins with a lexer that performs lexical analysis, scanning the source code for reserved words and generating a code token for each one. Character strings that don’t match reserved words are assumed to be variables. Each variable is assigned a place in an identifier table that tracks the name and contents of every variable in the program, and is then assigned a variable token that points to the variable’s position in the identifier table. Strings of numeric characters are converted to integers with corresponding integer tokens. This stream of tokens is sent to a parser that performs syntactic analysis. Each token is made into a node on a syntax tree that represents the program’s logical flow, with each node an operation that passes data to the node above it. The topmost node is called a program and the nodes that pass results are routines. Next, a code generator produces segments of machine code for each node, using templates and filling in the blanks with the values and variables found in each node. At this point the program is a string of binary numbers that constitute the machine language and values of the program. Finally, an optimizer inspects the code and eliminates redundancies to make the code smaller and faster.
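Here is a rough sketch of the first three compiler stages for a single arithmetic expression. Everything in it is a simplification made up for illustration: the token names, the two reserved words, and the stack-style instructions (`PUSH`, `LOAD`, `ADD`, `MUL`) that stand in for real machine code. The optimizer stage is left out.

```python
import re

RESERVED = {"print", "let"}  # hypothetical reserved words
TOKEN_RE = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|(.))")


def lex(source):
    """Lexical analysis: return (tokens, identifier_table)."""
    tokens, identifiers = [], []
    for number, word, symbol in TOKEN_RE.findall(source):
        if number:
            tokens.append(("INT", int(number)))
        elif word in RESERVED:
            tokens.append(("KEYWORD", word))
        elif word:
            if word not in identifiers:
                identifiers.append(word)
            # The variable token points at a slot in the identifier table.
            tokens.append(("VAR", identifiers.index(word)))
        elif symbol.strip():
            tokens.append(("OP", symbol))
    return tokens, identifiers


def parse(tokens):
    """Syntactic analysis: build a tree where '*' binds tighter than '+'."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else (None, None)

    def factor():
        nonlocal pos
        kind, value = peek()
        if kind not in ("INT", "VAR"):
            raise SyntaxError(f"unexpected token {value!r}")
        pos += 1
        return (kind, value)

    def chain(operator, operand):
        nonlocal pos
        node = operand()
        while peek() == ("OP", operator):
            pos += 1
            node = (operator, node, operand())
        return node

    tree = chain("+", lambda: chain("*", factor))
    if pos != len(tokens):
        raise SyntaxError("unexpected trailing tokens")
    return tree


def generate(node):
    """Code generation: fill a small template for each node, children first."""
    kind = node[0]
    if kind == "INT":
        return [f"PUSH {node[1]}"]
    if kind == "VAR":
        return [f"LOAD slot{node[1]}"]
    return generate(node[1]) + generate(node[2]) + ["ADD" if kind == "+" else "MUL"]
```

For the input `price + 2 * qty`, the lexer fills the identifier table with `price` and `qty`, the parser puts the multiplication below the addition, and the generator emits the multiply before the add.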

Chapter 8: How Windows Works

DOS was originally designed to run only one program at a time. Windows uses system virtual machines (VMs) to emulate hardware features such as ports and connections, making each program believe it is the only one running. Each 16-bit application is given its own virtual machine, while 32-bit programs use preemptive multitasking to run concurrently. These VMs run in the processor’s user mode (as opposed to kernel mode, which is reserved for the base system and its potentially dangerous commands). When any application wants to call on Windows XP’s core services, it must go through the API, which acts as a go-between for the application and the operating system. The API has three paired programs (one 16-bit and one 32-bit in each pair): user.exe/user32.dll, krnl386.exe/kernel32.dll, and gdi.exe/gdi32.dll. The user files contain routines that applications need to control and track windows. GDI files are collections of graphic elements that applications use to build their dialog boxes and send information to the screen. Kernel files handle low-level operations: managing memory, input/output operations and interrupts. In the kernel is a set of services, or subsystems, that contains common and necessary functions for all Windows programs.
DLLs are collections of software code that perform common software functions. An application that wants to use a DLL first checks with the API to find out how to call the function. All DLLs have APIs to help applications make successful function calls to DLLs they have not used before. When the DLL is called, the application passes along a specification for the type of information it expects the DLL to return. The DLL loads the specific routine into memory and the processor executes it, putting the return information into memory where the application can find it and instructing Windows to remove the DLL routine from memory. The application inspects the return information to determine whether the DLL function executed correctly and displays an error if it has not.
Windows creates three versions of clipboard data. For document data these are OEM (text), RTF (a generic formatted form), and Native (from the app that created the data). For graphics data these are Native, bitmap, and metafile. The metafile is a collection of commands that the graphics device interface (GDI) can use to recreate the image, and is an object-oriented graphic. When pasting data from the clipboard, the receiving application inspects the various formats and selects the one it understands or that retains the most formatting information.
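The paste step boils down to walking a preference list. The sketch below uses the format labels from the paragraph above as plain strings; they are not real Windows clipboard constants, and `choose_format` is a hypothetical helper.

```python
def choose_format(available, preferred):
    """Return the first format the receiving app prefers that is on the clipboard.

    available: dict mapping format label -> data placed on the clipboard
    preferred: the receiving app's formats, richest first
    """
    for name in preferred:
        if name in available:
            return name
    return None  # nothing on the clipboard that this app understands


# Three versions of the same copied text (contents are placeholders).
clipboard = {
    "Native": b"<app-specific bytes>",
    "RTF": b"{\\rtf1 Hello}",
    "OEM": b"Hello",
}
```

A word processor that understands RTF would call `choose_format(clipboard, ["RTF", "OEM"])` and keep the formatting, while a plain text editor asking only for `["OEM"]` gets the bare text.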
Until Vista, Windows worked primarily with bitmaps, which had to be rebuilt when moved or changed, causing banding and pixelation. Vista graphics are built with vectors to allow for smooth resizing and movement. Vista’s Windows Presentation Foundation just passes new determinants to the graphics processing unit (GPU) to render changes quickly and dynamically.
The logical pixels in Vista come with shadings that the GPU manipulates at the pixel level, which produces incredible detail. This is also responsible for the glass effect of the Aero desktop. Because each pixel is defined, occluded surfaces are still calculated, making for a more dynamic workspace. Jaggies and aliasing are also significantly reduced. Antialiasing uses color gradations to smooth the lines of the image. Vista uses antialiasing at the subpixel level, a technique called ClearType, which only works on LCD screens with parallel subpixel lines, not on CRT screens with a triangular arrangement of round pixels.
Vista has also implemented User Account Control, which issues both user and administrator tokens to administrators and manages access at the lowest necessary level (administrator accounts behave like standard user accounts until the user performs an administrator task). File and registry virtualization also makes each user account a virtual system, which helps to protect against virus attacks.

Chapter 9: How Software Applications Do Your Work

Application software is anything except the operating system. All software uses data. All data are numbers. Generally, software falls into seven categories: database management, word processing, numbers, graphics programs, multimedia, communication, and utilities.
Database managers let you define a datatype and its corresponding fields (attributes), then store records in table form. These can then be queried to sort and filter data, perform calculations and present data in a formatted, easy-to-read report. Databases can be either flat-file or relational. With fixed-length records, the computer finds a value by computing an offset from the record length and the field’s position within the record. Records with variable-length fields require pointers.
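Here is a small sketch of offset-based access using Python's `struct` module. The record layout (a 10-byte name, a 2-byte age, a 4-byte ZIP code) is invented for the example.

```python
import struct

# One fixed-length record: 10-byte name, 2-byte age, 4-byte zip = 16 bytes.
# "<" means little-endian with no padding between fields.
RECORD = struct.Struct("<10sHI")
FIELD_OFFSETS = {"name": 0, "age": 10, "zip": 12}


def build_table(rows):
    """Pack (name, age, zip) rows back to back, as a flat file would."""
    return b"".join(
        RECORD.pack(name.encode("ascii"), age, zip_code)
        for name, age, zip_code in rows
    )


def read_record(table, index):
    """Jump straight to record `index`: offset = index * record length."""
    name, age, zip_code = RECORD.unpack_from(table, index * RECORD.size)
    return name.rstrip(b"\x00").decode("ascii"), age, zip_code


def read_age(table, index):
    """Read one field: record offset plus the field's offset in the record."""
    start = index * RECORD.size + FIELD_OFFSETS["age"]
    (age,) = struct.unpack_from("<H", table, start)
    return age
```

Because every record is the same length, reading record 1 never requires scanning record 0. Variable-length fields break that arithmetic, which is why they need pointers instead.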
Spreadsheets are flat-file databases that allow users to manipulate data in cells to perform calculations. Spreadsheets process formulas by using a minicompiler that converts each formula to a more efficient tokenized format. The compiler also stores the formulas in reverse Polish notation. Using pointers, the spreadsheet creates a list of the cells that contain formulas, which saves the computer time in recalculating. Each cell has a formula token, an answer, and a pointer. When a cell is changed, the computer looks first at the cells containing formulas, determines which ones depend on the data that has changed, and marks each one that needs to be recalculated. The program then makes a second pass through the list, looking only at formulas marked for recalculation. For each, it determines whether that formula depends on another formula that hasn’t been recalculated yet, and if so it adjusts the cell’s pointers and the pointers of connected cells so the dependent formula moves to the end of the list. To calculate the formula, the software feeds the data, the formula and the formula codes requested into a calculation engine that generates the answer and writes it to the memory allocated for that cell. The spreadsheet then moves to the next formula and repeats this process.
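A simplified sketch of the recalculation pass follows. It assumes formulas are already stored as reverse Polish token lists, it recalculates every formula rather than only the marked ones, and it uses a queue in place of the pointer list. All names here are hypothetical.

```python
import operator
from collections import deque

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}


def eval_rpn(tokens, values):
    """Calculation engine: evaluate one reverse Polish formula."""
    stack = []
    for token in tokens:
        if token in OPS:
            right, left = stack.pop(), stack.pop()
            stack.append(OPS[token](left, right))
        elif isinstance(token, str):
            stack.append(values[token])  # a cell reference such as "A1"
        else:
            stack.append(token)          # a literal number
    return stack[0]


def recalculate(constants, formulas):
    """Evaluate formulas, moving any that must wait to the end of the list."""
    values = dict(constants)
    pending = deque(formulas)  # cell names, in stored order
    deferrals = 0
    while pending:
        cell = pending.popleft()
        refs = [t for t in formulas[cell] if isinstance(t, str) and t not in OPS]
        if any(ref in formulas and ref not in values for ref in refs):
            pending.append(cell)  # depends on a formula not yet calculated
            deferrals += 1
            if deferrals > len(pending):
                raise ValueError("circular reference")
            continue
        values[cell] = eval_rpn(formulas[cell], values)
        deferrals = 0
    return values
```

With `A1 = 10`, `C1 = B1 + 1` and `B1 = A1 * 2` stored in that order, `C1` is examined first, found to depend on the uncalculated `B1`, and moved to the end of the list.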
When a graphics program reads a bitmapped file, it first looks at information in the file’s header, which begins with a signature that identifies the file as a bitmap, followed by the height and width in pixels and the palette (how many and which colors). While black and white images require only one bit per pixel, a color bitmap requires more, depending on the number of colors (8 bits for 256, etc.). Each value is matched in the palette to a specific combination of red, green and blue that makes up a single virtual pixel. For 24-bit graphics, 3 bytes of memory are used to define more than 16 million possible colors (True Color). In Windows, 32-bit color includes an alpha channel that handles transparency and translucency for images and objects.
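The arithmetic behind those numbers is easy to check. In the sketch below the function names are made up, and the alpha-red-green-blue byte order in `split_argb` is an assumption for illustration, not a statement about any particular file format.

```python
def colors_for_depth(bits_per_pixel):
    """Each extra bit doubles the number of values a pixel can hold."""
    return 2 ** bits_per_pixel


def decode_indexed(pixel_bytes, palette, width):
    """Look each stored value up in the palette to get (red, green, blue)."""
    rows = []
    for start in range(0, len(pixel_bytes), width):
        row = pixel_bytes[start:start + width]
        rows.append([palette[index] for index in row])
    return rows


def split_argb(pixel):
    """Split a 32-bit value into four 8-bit channels (assumed A, R, G, B order)."""
    return ((pixel >> 24) & 0xFF, (pixel >> 16) & 0xFF,
            (pixel >> 8) & 0xFF, pixel & 0xFF)
```

`colors_for_depth(24)` gives 16,777,216, which is the "more than 16 million" figure for True Color.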
Some image files have built-in compression called run-length encoding (RLE), which uses a key byte to indicate whether the next byte represents several pixels or just one. If the key byte is 0, the software should apply the value to the next N pixels; if the key byte is 1, the software should apply the value to only the next pixel. The JPEG format examines the file for pixels whose absence is unlikely to be noticed. The software records bits for the reference pixel and then writes a list of the locations of every pixel that is the same color. For more drastic compression, an image can be divided into 8x8 pixel blocks and the color of each block averaged. If none of the pixels are too different from the average, the compression changes the color of all the pixels to the average. JPEGs recompress each time the file is saved, leading to banding and artifacts, or even posterization. JPEG 2000 uses wavelet technology, a calculation that transforms pixel values into numerical tokens for frequently recurring combinations of pixels. The strings replace the cruder blocks of JPEG compression and result in smaller files without the same types of visual artifacts found in standard JPEG images.
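Here is a sketch of the key-byte scheme for RLE. The text does not say where the run length N is stored, so this example assumes a layout of key 0, then a count byte, then the value, and key 1, then the value. Real image formats differ in the details.

```python
def rle_encode(pixels: bytes) -> bytes:
    """Encode runs as (0, count, value) and single pixels as (1, value)."""
    out = bytearray()
    i = 0
    while i < len(pixels):
        run = 1
        # Count how many following pixels repeat this value (max 255 per run).
        while i + run < len(pixels) and pixels[i + run] == pixels[i] and run < 255:
            run += 1
        if run > 1:
            out += bytes([0, run, pixels[i]])
        else:
            out += bytes([1, pixels[i]])
        i += run
    return bytes(out)


def rle_decode(data: bytes) -> bytes:
    """Reverse rle_encode by reading the key byte first."""
    out = bytearray()
    i = 0
    while i < len(data):
        key = data[i]
        if key == 0:    # the value applies to the next N pixels
            count, value = data[i + 1], data[i + 2]
            out.extend([value] * count)
            i += 3
        elif key == 1:  # the value applies to only the next pixel
            out.append(data[i + 1])
            i += 2
        else:
            raise ValueError(f"unknown key byte {key} at position {i}")
    return bytes(out)
```

A row of seven pixels with values 7, 7, 7, 7, 9, 3, 3 becomes three entries: a run of four, a single pixel, and a run of two. Long runs shrink a lot; an image with no repeats actually grows, since every lone pixel costs two bytes.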
Vector-based graphic images are stored in a file as a display list that describes the mathematical properties of every object along with its location and properties such as line width and fill. The file also records a hierarchy of objects. Vector graphics can scale smoothly. To display a vector image, the program reads all the formulas and their accompanying data from the display list and uses them to compute a temporary bitmapped image. This process is called the viewing transform. To display a fill, the program calculates the locations of all the pixels that make up the edge of the shape and changes the color values of all the pixels inside that edge.
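As a rough sketch, a display list can be a list of dictionaries, and the viewing transform a loop that tests every pixel against every shape. The shape fields (`type`, `x`, `w`, `cx`, `r`, `fill` and so on) are invented for this example, and characters stand in for pixel colors.

```python
GEOMETRY_KEYS = ("x", "y", "w", "h", "cx", "cy", "r")


def inside(shape, x, y):
    """Use the shape's formula to decide whether pixel (x, y) is filled."""
    if shape["type"] == "rect":
        return (shape["x"] <= x < shape["x"] + shape["w"]
                and shape["y"] <= y < shape["y"] + shape["h"])
    if shape["type"] == "circle":
        return (x - shape["cx"]) ** 2 + (y - shape["cy"]) ** 2 <= shape["r"] ** 2
    raise ValueError(f"unknown shape type {shape['type']!r}")


def render(display_list, width, height, background="."):
    """Compute a temporary bitmap; later shapes paint over earlier ones."""
    bitmap = [[background] * width for _ in range(height)]
    for shape in display_list:
        for y in range(height):
            for x in range(width):
                if inside(shape, x, y):
                    bitmap[y][x] = shape["fill"]
    return ["".join(row) for row in bitmap]


def scale(shape, factor):
    """Scaling only multiplies the stored numbers; no pixels are stretched."""
    return {key: value * factor if key in GEOMETRY_KEYS else value
            for key, value in shape.items()}
```

Rendering a 2x2 square on a 4x4 grid and then rendering `scale(square, 2)` on an 8x8 grid gives a crisp 4x4 square, because the bitmap is recomputed from the formula each time.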
Morphing converts one image into another in a number of discrete steps, each of which is a blend of both. One common method places pairs of reference dots on the two images; the more pairs, the more realistic the result. The software then simultaneously repositions the transported pixels and the polygons they form to warp the emerging image, so the pixels move steadily toward the positions they occupied in the picture from which they came.
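The in-between steps come down to linear interpolation. The sketch below only moves the reference points and blends one pair of colors; a real morph would also warp the pixels inside the polygons. The function names are hypothetical.

```python
def lerp(a, b, t):
    """Linear interpolation: t=0 gives a, t=1 gives b."""
    return a + (b - a) * t


def morph_points(start_points, end_points, steps):
    """Return steps+1 frames of reference points moving from start to end."""
    frames = []
    for step in range(steps + 1):
        t = step / steps
        frames.append([
            (lerp(x0, x1, t), lerp(y0, y1, t))
            for (x0, y0), (x1, y1) in zip(start_points, end_points)
        ])
    return frames


def blend_pixel(color_a, color_b, t):
    """Cross-dissolve two (red, green, blue) colors by the same fraction t."""
    return tuple(round(lerp(a, b, t)) for a, b in zip(color_a, color_b))
```

With two reference pairs and four steps, the middle frame sits exactly halfway between the two images, both in point positions and in color.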
Word processors store text as ASCII codes. The document header stores information about the file such as default font, margin settings, tab settings, and other data that applies to the whole document unless a change is made. When a section of text has an attribute or formatting applied, the word processor can use inline formatting (as HTML does), where it inserts its own code for that attribute where the formatting is supposed to start and signals again at the end. Alternatively, MS Word uses a collection of tables to track all formatting information: section properties, paragraph properties and character properties are all recorded, and pointers reference the text to which the formatting applies.
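To see the difference between the two approaches, here is a sketch that keeps the text plain and stores formatting in a separate table of (start, end, tag) entries, then converts it to inline codes. The table layout is invented for illustration and is not Word's actual file format; it also assumes the entries do not overlap.

```python
def render_inline(text, runs):
    """Turn a formatting table into inline start and end codes.

    runs: list of (start, end, tag) entries pointing into `text`
    """
    pieces = []
    cursor = 0
    for start, end, tag in sorted(runs):
        pieces.append(text[cursor:start])                   # unformatted text
        pieces.append(f"<{tag}>{text[start:end]}</{tag}>")  # start/end codes
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


text = "How Software Works"
codes = list(text.encode("ascii"))           # the stored ASCII codes
character_table = [(13, 18, "i"), (0, 3, "b")]
```

Note that in the table version the text itself never changes when formatting is added or removed; only the table entries do.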
