Some notes on type systems and type theory

Data types and their application in programming languages have been the subject of extensive study, but little progress was made until the λ calculus was introduced as an instrument for these studies. Typed λ calculus became one of the most important tools for the study of data types and of type systems.

The syntax and rules are simple, and the most astonishing thing is that such a simple syntax with so few rules allows for such profound reasoning. Even more astonishing is the fact that you can use the theories to build efficient type systems, such as the Java type system.

An in-depth understanding of the λ calculus and type theory is far beyond the scope of this course.

DD2471 (Lecture 02) Modern database systems & their applications Spring 2012 1 / 35
Some notes on type ..., history

The ideas behind the λ calculus are not new. It all started with Leibniz, who stated around 1700 that he wanted to create a universal mathematical (or symbolic) language good enough to formulate all possible problems, and a method (algorithm) that could be used to formulate solutions for all the problems one could formulate with the universal language.

If you stick to mathematical problems, creating the language is simple enough (and Gödel numbering, much later, told us that most problems can be coded as numbers and numeric computations). Set theory as formulated in first-order predicate logic is quite sufficient.

The algorithm was the bigger problem. Alonzo Church, using the λ calculus (of which he is the founder), and independently Alan Turing, using his "Turing machine" (the name came far later), proved that such an algorithm cannot be designed.

Consequently you had a definition of "computability" (or rather "provability") and also a proof that some problems have no solutions (some computations cannot be performed), as well as languages that are good for solving most (but not all) problems.
Some notes on type ..., history ...

Turing also showed that his notation and the λ calculus were equivalent.

The λ calculus may be viewed as a language in its own right and serves as the base for a number of languages, e.g. Lisp, Scheme, Clean, ML, Miranda and Haskell.

The Turing machine, on the other hand, is the model for the von Neumann machines (= computers), which are all conceptually Turing machines with random-access memory. Assembler languages are direct models of Turing machines, while imperative languages are all higher-order Turing machines.

If you want to study extensible type systems you may start with untyped λ calculus to get the basics and then continue with typed λ calculus to understand type theory.

This is just a pointer to the importance of the λ calculus in data type and type systems research.
For those who want to dig deeper into type systems:

• Take a course, but not at KTH. There are courses at Chalmers in Gothenburg, at Uppsala University and in Lund.
• Or take up programming in ML, Haskell or Scheme and work on advanced problems.
• Or read: Barendregt: "The Lambda Calculus: Its Syntax and Semantics", 1984
• Or (more easily accessed): Hindley, Seldin: "Introduction to Combinators and λ-Calculus", 1986, Cambridge University Press (I have a reprint from 2008)
• Or read the not so easily accessed article by Barendregt (link under "λ calculus (type systems)" on the course link page)

The importance of the λ calculus in the development of efficient type systems deserves emphasis. E.g. the Java type system is built on modern type-theoretical models based on the λ calculus.
Some notes on data types in modern database applications
or data types in the world of mass media

◮ text: formatted or not, free text or XML
◮ graphics: vector graphics (CGM, FIG, PICT, PostScript)
◮ images: bitmaps, photos (JPEG, MPEG, GIF)
◮ animations: sequences of graphics
◮ video: sequences of photos
◮ audio: sound, music, mostly digital or digitized
◮ combination of all these
Some notes on data types ...

Common factor: all have a rich internal structure and are big. (Simple sound sequence: 8 KB; a 100-page text: 500 KB; colour image: 6-80 MB; 5 min of high-quality video: ≈ 50 GB.)

• It's normal to play (play back) several different data streams simultaneously.
• Tough requirements on storage media and application programs.
• Download/retrieval time may be a problem.
• Differences in download/retrieval time are another.
• Synchronization may be required, time delay calculations too.
• Intricate semantics with complex objects.
• Querying is difficult.

Meta-data is important, while meta-data capture is complicated.
Meta-data

Meta-data is data that is needed to interpret other data; it explains the meaning of the data. It is important in information management and very important when managing complex data.

In an RDBMS, meta-data is used to describe classes of objects (table content). With new data types you may have to use meta-data to describe each object. Thus every row in every table may need to be associated with its own meta-data.

Meta-data may be used for indexing as well as for linguistic annotation.
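The idea that every object may carry its own meta-data can be sketched as follows. This is only an illustration: the class `MediaObject`, the helper `find_by_metadata` and all field names and values are hypothetical and do not come from the course material.

```python
from dataclasses import dataclass, field


@dataclass
class MediaObject:
    """One stored object (e.g. one table row) that carries its own meta-data."""
    object_id: int
    payload: bytes                                # opaque data, e.g. an image
    metadata: dict = field(default_factory=dict)  # describes this object only


# Two objects in the same hypothetical table, each with different meta-data keys.
photo = MediaObject(1, b"\xff\xd8...", {"format": "JPEG", "rows": 480, "columns": 640})
speech = MediaObject(2, b"RIFF...", {"format": "WAV", "speaker": "unknown"})


def find_by_metadata(objects, key, value):
    """Search on the meta-data, never on the opaque payload itself."""
    return [o.object_id for o in objects if o.metadata.get(key) == value]


print(find_by_metadata([photo, speech], "format", "JPEG"))
```

In a traditional table the column definitions would describe all rows at once; here each object needs its own description because the set of relevant properties differs from object to object.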
Meta-data ...

Type            Schoolbus
# Seats         54
Total Weight    20 tons
Price           32 k$

Analog data must be digitized. The accuracy of digitized data depends on the sampling frequency.

An image has m rows and n columns. Each element of this 2D array is called a pixel and contains information about colour and intensity.
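A minimal sketch of both points, sampling and the pixel array, under stated assumptions: the 5 Hz tone, the two sampling rates, and the names `sample` and `tone` are made up for illustration.

```python
import math


def sample(signal, duration_s, rate_hz):
    """Digitize a continuous signal by reading it rate_hz times per second."""
    n = int(duration_s * rate_hz)
    return [signal(i / rate_hz) for i in range(n)]


def tone(t):
    """A 5 Hz sine wave standing in for an analog source (assumed example)."""
    return math.sin(2 * math.pi * 5 * t)


coarse = sample(tone, 1.0, 8)    # 8 samples per second: too few to follow the tone
fine = sample(tone, 1.0, 100)    # 100 samples per second: a much closer record
print(len(coarse), len(fine))

# An image with m rows and n columns as a 2D array of pixels.
# Each pixel here is an (R, G, B) triple; other colour models work the same way.
m, n = 2, 3
image = [[(0, 0, 0) for _ in range(n)] for _ in range(m)]
image[1][2] = (255, 0, 0)        # set the pixel in row 1, column 2 to red
print(len(image), len(image[0]))
```

A higher sampling frequency gives more values per second and therefore a more accurate digital copy, at the cost of more storage.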
Meta-data ...

Data may be external to or internal in the database. Internally, data is stored as a BLOB (binary large object) or a CLOB (character large object).

Lossless compression: 1/3 of the original size
Lossy compression: 1/80 of the original size, with not too irritating (??) quality loss
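To get a feeling for these ratios, the arithmetic can be applied to the smallest colour image size quoted earlier in the slides (6 MB). The sketch below assumes 1 MB = 10^6 bytes and treats 1/3 and 1/80 as exact ratios, which real compressors will not deliver for every image; the function name is hypothetical.

```python
MB = 10**6  # assumption: decimal megabytes


def compressed_size(original_bytes, reduction_factor):
    """Size after compression to 1/reduction_factor of the original."""
    return original_bytes // reduction_factor


image_bytes = 6 * MB
lossless_bytes = compressed_size(image_bytes, 3)   # 1/3 of the original
lossy_bytes = compressed_size(image_bytes, 80)     # 1/80 of the original
print(lossless_bytes, lossy_bytes)
```

With these assumptions the 6 MB image shrinks to 2 MB losslessly and to 75 KB with lossy compression.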
Meta-data ...

JPEG (Joint Photographic Experts Group), 4 modes:

• sequential: left-to-right, top-to-bottom
• progressive: starting with a few pixels, gradually adding pixels until the whole image is shown
• lossless: exact correspondence with the original image
• hierarchical: a variety of versions with different degrees of quality loss
Meta-data ...

There is a large number of file formats and compression methods.

What kind of meta-data do we need?

• structure, colour, ... (image)
• frequencies (sound)
• typeface, font size, ... (text)
• direction of motion, lighting (video)
• id of a speaker, place and time of speech
• ...
Meta-data ...

You need to extract or generate meta-data.

Images: colour and structure
Histogram for colour (RGB, CMYK, Y'UV, ...)
If the colour model is not Y'UV, one may have to add luminous intensity as well.

Meta-data may be generated:

• manually, which is time-consuming
• semi-automatically, where automatically generated meta-data is supplemented with manually generated meta-data
• automatically
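Automatic generation of colour meta-data can be sketched as a coarse colour histogram. This is an illustration only: the function names, the choice of 4 levels per channel and the sample pixels are assumptions. The `luma` helper shows one common way to add an intensity value for RGB data, using the ITU-R BT.601 weights for Y'.

```python
from collections import Counter


def colour_histogram(pixels, levels=4):
    """Count pixels per quantized (R, G, B) bin, with `levels` bins per channel."""
    step = 256 // levels
    return Counter((r // step, g // step, b // step) for r, g, b in pixels)


def luma(r, g, b):
    """Y' computed from 8-bit R'G'B' values with the BT.601 weights."""
    return 0.299 * r + 0.587 * g + 0.114 * b


# Four hypothetical pixels: two reds, one blue, one black.
pixels = [(255, 0, 0), (250, 10, 5), (0, 0, 255), (0, 0, 0)]
hist = colour_histogram(pixels)
print(hist[(3, 0, 0)], hist[(0, 0, 3)], hist[(0, 0, 0)])
print(round(luma(255, 0, 0), 3))
```

The histogram is small and fixed in size regardless of the image, which makes it practical to store next to the image as meta-data and to compare between images.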
Fetching data

Operations by media type (Text / Sound / Image):

Manipulation
    Character / Waveform / Geometric
    String / Sound editor / Pixel
    Editing / Filtering
Presentation
    Formatting / Synchronization / Composition
    Decoding / Decompression / Decompression
    Conversion / Conversion
Analysis
    Indexing / Indexing / Indexing
    Searching / Searching / Searching
Fetching data ...

    select wine_name as name, price,
           dbms_lob.substr(note, 10, 20) as "comment"
    from wine_list
    where dbms_lob.instr(note, 'poise, elegance and balance') <> 0

Non-standard LOB manipulation packages exist for most DBMSs.

Traditional data is managed as usual:

    select category, year, avg(price) as average_price, max(price) as highest
    from wine_list
    where region = 'Bordeaux'
    group by category, year
    having year between 1995 and 1998
    order by category, year
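Both queries can be tried out without an Oracle installation. The sketch below uses Python's built-in `sqlite3` module with a made-up `wine_list` table; the rows, the note texts and the column types are assumptions. SQLite has no `dbms_lob` package, so its built-in `instr` and `substr` stand in for the LOB functions. Note the argument order: Oracle's `dbms_lob.substr(lob, amount, offset)` takes the length before the start position, while SQLite's `substr(text, start, length)` takes the start first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table wine_list ("
    "wine_name text, category text, region text, year integer, price real, note text)"
)
# Hypothetical sample data, invented for this sketch.
rows = [
    ("A", "red", "Bordeaux", 1995, 100.0, "Shows poise, elegance and balance throughout."),
    ("B", "red", "Bordeaux", 1995, 200.0, "Dense and tannic."),
    ("C", "white", "Bordeaux", 1997, 50.0, "Fresh and crisp."),
    ("D", "red", "Bordeaux", 2001, 300.0, "Too young to judge."),
    ("E", "red", "Rhone", 1996, 80.0, "Spicy."),
]
conn.executemany("insert into wine_list values (?, ?, ?, ?, ?, ?)", rows)

# Free-text search in the note, returning 10 characters starting at position 20.
lob_like = conn.execute(
    "select wine_name, price, substr(note, 20, 10) "
    "from wine_list "
    "where instr(note, 'poise, elegance and balance') <> 0"
).fetchall()

# The traditional aggregate query from the slide, unchanged.
aggregate = conn.execute(
    "select category, year, avg(price) as average_price, max(price) as highest "
    "from wine_list "
    "where region = 'Bordeaux' "
    "group by category, year "
    "having year between 1995 and 1998 "
    "order by category, year"
).fetchall()

print(lob_like)
print(aggregate)
conn.close()
```

With this sample data only wine A matches the text search, and the aggregate query returns one row for red 1995 and one for white 1997; the 2001 group is removed by the `having` clause and the Rhone wine by the `where` clause.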
Presentation of results

• Combining result sets
• Results that resemble a given example
• Results that may be ordered according to some criteria

On a lexical level
On a syntactic level
On a semantic level

◮ Presentation model
◮ Dialogue model
◮ Context
◮ Interaction model
◮ Application model
◮ Application wrapper
Presentation of results ...

Example:

Elmander meets the ball at the far end of the box after a corner and nets as if he had a lacrosse racket for a right foot. Not a chance for the goalkeeper. Niicee!