Siward's Short Introduction to Mysql Version 1.2 , 2004-10-30 Contains - basic information gathered from html docs in /usr/share/doc/mysql-doc/*.html - a lot of info from (34811 line) reference manual in info::mysql, but not full detail Formatted into shapes of a 143 line overview of what databases are, and a 1430 line tutorial on how to use them. Intended for people (like myself) that want to * learn what a database is, * learn to use a mysql database, and * have a reference to look up syntax and meaning of most frequently used queries. This document is provided to you for free under the terms of the GNU General Public License by it's copyright holder Siward de Groot (that's me). The creators of MySQL provide their database and it's documentation for free. The Debian MySQL package maintainers make it easy to install and upgrade MySQL on my computer, they also did it for free. Thanks ! :-) This document was created in october 2004 ; it pertains to MySQL version 11.16 Distrib 3.23.49, for pc-linux-gnu (i686). Any errors in this document are most likely due to me ; i welcome constructive criticism. ========================================================================================== OVERVIEW : DATABASE Databases are (usually large) collections of data stored in electronical form. For example, an airline company has a database of - flights (from where to where at which date and time, which plane and crew) - reservations (made by who, payment received, how many free places left), - salary (hours worked, other benefits, accountnumber for payment), etcetera. Data is stored in tables that each have a name, for example 'flights', and each row in that table consists of named fields, for example 'from', 'to', and each field has a value, for example 'amsterdam', 'monday', '20:30', Rows are central concept of databases, each row describes a unit of what it's table stores. Fields can contain names of other tables, for example table 'flights' could have a field telling in which table reservations for that flight are registered. Database programs (often simply called 'databases') can create a new table, add a row of data, change fields in a row, delete rows, etc. They also provide ways to enable users to connct to a database and find data in it, for example, a travel agent can connect to Alitalia's database via internet and find out which flights leave today from Roma and how many seats are available on them. Mysql is a professional database program that is used by major companys (for example Yahoo, Slashdot, NASA), and was developed out of a need for a higher performance database. It is distributed for free use under the terms of the GNU General Public License. For commercial use, installation and maintenance support are available, as is a special version that can handle extremely huge databases ; contact the Mysql team for specifications and prices, their homepage is at http://www.mysql.com . Mysql is well known in the free software community for it's ease of use with the Apache webserver. In addition to the 2 meanings of 'database' mentioned above (the data and the program), there is yet a third one, which refers to a datatype used by a database program : As mentioned above, tables can reference eachother, and a group of tables that can be used together efficiently is a database. Another advantage of having this 'database' datatype is that it makes it possible to give tables in different databases the same name. The general term 'database' can be used to refer to the data, the base, and anything else :-) . SQL There are many databases, of many companys, and needing to learn a new way to search for data for each of them is a very bad idea, so a (reasonably simple) syntax for searching and otherwise using a database has been standardized ; it is called 'structured query language' (SQL) and is described in standard 'SQL92' (standard was made in 1992). Mysql conforms to SQL92 and also provides some usefull additions to it. Searching a database usually consists of specifying a table, selecting rows from it that satisfy some criteria, and presenting fields from these rows that user is interested in ; Sql syntax reflects this structure, for example, to find flights of a certain cockpit crew, one could do 'SELECT flightnumber, from, to FROM flights WHERE pilot="Roger" AND copilot="Wilco" ;' As another example of SQL, here is how to tell it which group of tables to use by default : 'USE DATABASE databasename ;' so you better get used to this meaning of the word 'database'. FAST SEARCHING Most common operation on a database, and also one that costs most time, is searching for rows that match some criterion. Databases have special facilities for speeding this up ; a field can be declared to be a 'key' to a table when that table is created, then database program will create a hash (or somesuch) index of values of that field in all rows, so subsequent searches for values of that field will be much faster. Hashes are not created by default because they cost memory and time to create, and because usually only some fields's values are used as selection criteria in a search. TRANSACTIONS Often data in a database needs to be remotely accessible by multiple connections, in a way that keeps data correct. For example, if two travel-agents simultaneously use their internet connections to see whether places are free on a certain flight and try to book the last seat on it, both must get correct results, Therefore databases are managed by a server, ( a 'server' is a program that continually runs on a computer and can receive requests from other programs (called 'clients'), to which it responds by sending them a message, thus providing a service to them ) which makes sure data is reliably stored and changed. For cases where data-integrity needs to be maintained (as in the travel-agent example), mysql can be run in 'transaction-safe' mode ; In this mode a sequence of mysql commands is committed to the database as a whole, whithout possibility of commands from another client's connection to be executed inbetween. For example, a travel agent that arranges a flight from A to C by reserving seats on a flight from A to B and on a connecting flight from B to C, where both flights have only one not-yet-reserved seat, specifying data (flight, price, travelagent, etc.) for addition as rows to tables 'flightAB.reservations' and 'flightBC.reservations' will finish adding both rows before a commit from an other travelagent is started. (or it will fail both row-additions if an other travelagent reserved one of these seats first). For the user (the travelagent) this means that a final 'commit' command is needed (when the traveller has agreed to pay), and that this commit command could fail (even when retrieved data said a seat was available) (which is a nuisance, but is better than booking two customers for the same seat). It is also possible to use a database without special precautions ; in this case every mysql command is executed immediately ; this is called 'auto-commit' mode. Auto-commit mode is faster and uses less memory than transaction-safe mode. To use transaction-safe mode, special tabletypes must be used, which take more memory. It is not possible to use transaction-safe mode on tables that were not meant for it. It is possible to use auto-commit mode on tables that support transaction-safe mode. Mysql uses auto-commit mode by default, and it's default tabletype, MyISAM, does not support transaction-safe operation. Generally, retrieving and updating values of tables can be done as transaction, but changing layout of tables or databases can not. FRONTEND In addition to the database programs themselves, there are 'frontend' programs, for example an X interface with buttons, and scrollbars, and the name of the travelagent in golden letters. Mysql comes standard with - a shell interface that supports multiline input - librarys to make it accessible from C, C++, Perl, Python, PHP, Java, Tcl, Eiffel, and ODBC. (multithreaded apps can lock tables for read and/or write access). The documentation that comes with Mysql mainly describes the shell interface ; working via one of the programming language APIs works similar, because, after having connected to the database, programs send mysql queries as strings to the mysql server. ========================================================================================= TUTORIAL : This tutorial tries to - tell you all you need to know for writing a simple application, - be useable as a fast reference (like a cheat-sheet), and therefore presents a fair amount of detail ; If you want to start easier but slower, read the html tutorial in /usr/share/doc/mysql-doc/ . If you want more detail, the real documentation is in info under topic mysql. ----------------------------------------------------------------------------------------- Table of contents : Line Length Name 181 44 Start 225 16 Notes on syntaxdescriptions 241 52 General remarks on the SQL language 293 38 Database 331 49 Records, fields, tables, rows, and columns 380 15 Tables 395 91 Fieldtypes 486 60 Keys 546 19 Tabletypes 565 38 Syntax of 'create table' 613 10 Deleting a table 623 120 Manipulating table data 743 20 Changing table layout 763 136 Retrieving data 879 12 'As' and presenting additional data 891 211 Expression syntax 1102 74 Regular expressions and relevance matching 1176 309 Standard functions 1485 34 Using multiple tables 1519 8 Script file ----------------------------------------------------------------------------------------- START Mysql is in packages 'mysql-server', 'mysql-client', and 'mysql-doc' ; these must all be installed if you want to learn to use mysql. To use mysql, mysql server must be running ofcourse, if it is not, it can be started with 'mysqld' from commandline. I read that it may be necessary to have mysql-admin give privileges to user for use of a specific table. I did not notice this in practice. Am i mysql-admin ? Can anyone on localhost access all tables ? Are such things set in a .mysqlrc file or something like it ? When mysql daemon is running, a client application can connect to it ; from C this is done by - compiling the program with -I/usr/include/mysql -lmysqlclient -lz - #include - getting a validly initialized connectionhandle with mysql_init( MYSQL * H); - connecting with mysql_real_connect(H) from commandline simply type 'mysql', which starts a commandline interface that automatically connects to server. It is also possible to use some utility programs provided by mysql package, for example 'mysqldump', which writes contents of a table to stdout. Another aproach is to use 'mysql -e ', which executes mysql statement and writes output to stdout. Once connected to mysql (via a programlanguage api or through shell interface), sql queries can be sent to mysql server, from shell by typing them in, followed by a ; and , from C by using mysql_query( MYSQL * H, const char * query) . At end of session, connection must be terminated (in shell type 'Quit', in C 'mysql_close()' ). Mysql queries form language of interacting with mysql database server, and i describe usual queries hereafter. I indent queries by one tab, to distinguish them from normal text. On shell, a ';' must be added to each query, to indicate end-of-this-query, so that queries can be typed on multiple lines ; From API this is not necessary. These ';'s are not part of query, so they are not shown below. If from shell you forget to add them, you'll be prompted for more input. On shell, a command can be cancelled by typing '\c' . ----------------------------------------------------------------------------------------- NOTES ON SYNTAXDESCRIPTIONS In syntax descriptions, text in <>s describes what to put there, <>s should not appear in any real query. For example means to put some number there. text in []s denotes an optional item ... means optional repetition(s) of the previous item ,,, means optional repetition(s) of previous item, separated by commas. | separates alternatives ()s are used to define a group for repetition or alternatives note that ()s are also used in mysql itself. where ambiguous it will be noted. For example : [-]...[....][(E|e)[-]...] describes the syntax of a number in standard notation, like 1 or 3.1 or 1.2E-3 . ----------------------------------------------------------------------------------------- GENERAL REMARKS ON THE SQL LANGUAGE * Names and keywords Keywords in queries are not case-sensitive ; they are often written in uppercase to distinguish them from names (of tables etc.) . Names can contain all letters and digits plus _ and $ , but not the space character (most other characters have special meanings as math operator, string delimiter, etc.). Names can start with digits, but can not consist of only digits. Names consisting of digits and letters a-f can be mistaken for hex numbers, so dont use them. Names can be maximum 64 characters long. Default locale is ISO8859-1 (Latin alphabet, with the usual sort order). Some names (namely those of databases and tables) are case-sensitive on Linux, but you must - treat all names as case-insensitive (i.e. dont use different cases of same name for different things), and - stick to one caseform for referring to each named thing (i.e. dont use different cases of same name for same thing). * Whitespace Extra whitespace (spaces, tabs or newlines) is allowed everywhere, except in the middle of keywords or names ofcourse. * Comments Comments can be put in mysql language in Perl-style or C-style, C style comments are like: /* this is a comment */ and can be placed at every place whitespace can be put. They can contain newlines, but, unlike in C, not ';' or unmatched single- or double-quotes ; these would be interpreted as terminating a mysql statement or starting a string. Perl style comments are like: # this is a comment and continue to end of line. They can be placed at end of every line. Comments are usefull in scriptfiles (scriptfiles contain sequences of mysql queries that have to be executed repeatedly, so you don't need to type them every time). * Reserved Words Mysql has 112 reserved words that are not allowed as names of tables, databases, fields, etc., you can however use them by putting them in quotes (keywords are never quoted), but that is clumsy. Most of these are keywords in mysql queries. They include common words like: string, local, password, process, file, end. Learning them by heart is not a nice idea, and the only way to detect them is by mysql complaining about syntax errors. so if you get a syntax error that you don't understand in a line where you used a variablename for the first time, try quoting that name (or look in info for list of reserved words). ----------------------------------------------------------------------------------------- DATABASE A database is a set of tables that can refer to eachother. A server can (and usually does) manage multiple databases. First thing to do after having established a connection to server is specify which database you want to access, this can be an existing database or a new database. To create a new database : create database Now new database has been created, and you access it just like a pre-existing database. Databases can be deleted from server by : drop database [if exists] This is a very DANGEROUS command, as all data in database will be deleted. To see which databases exist : show databases To use a database : use database If access is denied to you for this database, then as mysqladmin do : grant all on .* to To find your mysql name : select user() which reports @ . All databases can really be used simultaneously, but in that case tablenames have to be given as ., and fields as ... 'use database' says which databasename is default. To see which database is currently default database : select database() -------------------------------------------------------------------------------------------------- RECORDS, FIELDS, TABLES, ROWS, and COLUMNS, and a note on description syntax. Data in a database is stored in form of 'records', every record consists of a set of 'fields' which have a name, a type and a value (usual types are string, integer and date). a record thus also has a type, but this is not explicitly used in mysql. Records of same type can be collected in a table, where each row of that table is a record. Mysql docs seldom speak of 'record's, usually they are called 'row's, because 'table' is central concept in mysql ; In mysql, records only exist as rows in tables. A row in a table consists of fields, and each field has a name and a type. A table that has multiple rows thus consists of columns, each column containing fields of same name and type from all rows. Mysql docs speak of a 'columnname', which is same as fieldname. I will call these 'fieldname' in rest of this text, because that is shorter than 'columnname', but will use 'column' in constructs like 'all columns', as 'all fields' means something different. On many occasions, a list of rows or columns can be supplied as an argument to a command. These are usualy formed as [,]... ie a comma-separated list in which extra whitespace is allowed. To make description more easily readable, i show such lists in form This is true for any word in <>s that is plural, unless noted otherwise. -------------------------------------------------------------------------------------------------- TABLES Next thing you need to do is determine which table(s) you want to use. This can be an existing table or a new table. To see which tables exist in current database : show tables To rename a table : rename table to To delete a whole table : truncate To create a new table : create table ( ) where a can be there are also some other fieldspecs possible (see furtheron). It is not strictly necessary to specify all fieldnames and types of table when creating a table, fields can be added to an existing table lateron. (see furtheron). At least one field must be specified at creation time however. Syntax of fieldspecs will be described lateron, basic fieldtypes are string, date, and number ; For now i just show an example : create table pet ( name varchar(20), birthdate date, numberoflegs int ) A temporary table can be created by adding keyword 'temporary' between 'create' and 'table' ; such table disappears when connection to server is closed. Before more completely describing syntax of 'create table', i need to tell you about fieldtypes, keys, and tabletypes. -------------------------------------------------------------------------------------------------- FIELDTYPES Fields have a type, which determines which kind of data you can put in it. Basic datatypes are : string, date, and number. Because small numbers can be stored in less memory than large number, and the same applies to short and long strings, several datatypes are available for each basic datatype. They are : * numeric types : tinyint 8 bit integer smallint 16 bit integer mediumint 24 bit integer int 32 bit integer bigint 64 bit integer float 24 bit float with 8 bit exponent (9 digits after decimaldot) double 48 bit float with 16 bit exponent (17 digits after decimaldot) decimal(a,b) string of digits, stored as characters a is countof chars, including sign and decimaldot, b is countof digits after decimaldot. * date and time types : these can be assigned to and used as strings and as numbers. date date in YYYY-MM-DD format, time time in HH:MM:SS format, datetime combination of date and time, in format YYYY-MM-DD HH:MM:SS timestamp datetime with time of last operation shorter formats are possible (see info) year year in 4-digit format (by default) * string types : char() string of characters must be smaller than 256 value is auto-padded with spaces or truncated to fit. bit,bool,char synonyms of char(1) varchar() variable length string, int argument specifies maximum length, maximum length can not be more than 255. trailing spaces are auto-removed. tinytext variable length string of max 255 chars text variable length string of max 65.535 chars mediumtext variable length string of max 16.777.215 chars longtext variable length string of max 4.294.967.295 chars tinyblob, blob, mediumblob, longblob same as *text, except comparisons are case-insensitive enum enumeration : a string that can only hold one of a number (max 256) of specified values. set a string that can hold any number of values from a user-specified list of (max 64) possible values. Difference between varchar and tinytext/blob is that memoryblocklets for values of text/blob fields are allocated individually. "Values of enum and set must be specified when table is created". (maybe when column is created or altered ?) Trying to in a set or enum store a value that is not in specified list, result is value "" (empty string) being stored. Empty string is a special error value that is considered to be in specified list at position 0 (first element of enum is at position 1). In numeric context, an enum evaluates to index of current value in specified list, and a set evaluates to bitmask of values in set taken as an unsigned integer, presence of first value in list sets least significant bit, etcetera. A set can be assigned to by a commaseparated list of allowed strings, the whole list must be in quotes, individual elements don't need quotes. NULL : Fields can also contain special value null, which often means 'value is not set yet'. Setting a date field to null really sets it to current date. Null values are treated specially by math and logical operators, see furtheron for that. Fields can be specified as 'not null' in a 'create table' or 'change table' statement, then: it is an error to assign null to such a field, if default value would otherwise be null, it is now something else (often 0), such fields can be used for fast searches. USE IN CREATE TABLE : Type of a field is specified in a 'create table' statement as a fieldspecifier of form [] ... where numeric fields can have options 'unsigned' and 'zerofill', char and varchar fields can have option 'binary', and other fieldtypes have no options. Unsigned fields can only contain non-negative numbers, Zerofill fields are automatically filled with 0 when table is created, Binary fields are case-sensitive in comparisons. -------------------------------------------------------------------------------------------------- KEYS Professional databases need to be able to quickly locate data. Speeding up lookups in a table can be done by creating hashvalues of all fields that are used as a thing to search for. These hashvalues cost memory, and any combination of parts of fields in a database can be used to specify desired rows, so these hashvalues are not computed by default. If you want fast searches, you need to tell mysql that a specific (set of) field(s) will be frequently used for searching on, and you do this by adding a 'key' clause to a 'create table' statement (or to a 'change table' statement, see furtheron for that). 'key' is the thing whose value is looked for when searching a table. 'index' is a synonym of 'key'. There are 5 kinds of keys : simple keys, unique keys, primary keys, and fulltext keys. Simple keys can be anything, as just described, Unique keys add requirement that data in table to search does not contain identical values of that key ; trying to add to a table a row that would have an existing value for a unique key is refused, an errormessage is printed instead. Primary keys are unique keys of which each table can have no more than one, and with additional restriction that fields used in key must be fields whose data can not be null (null values are handled differently from normal values, see furtheron for that) ; such fields must be declared 'not null' in a 'create table' or 'change table' statement. Fulltext keys are a special type of index that is good for searching (large) pieces of text ; they can only be used for fieldtypes *text and varchar, and are only available in tables of type MyISAM (which is default, see furtheron). Keys, whether fulltext or not, on fields of type blob or text are only possible when using MyISAM tables, for fulltext keys whole piece of text is used for indexing, for other keys you _must_ specify maximum length (in characters) of textpiece to use for indexing (for syntax see next paragraph). Keys must be declared, so mysql can precompute their hashvalues. A simple key is declared by adding a fieldspec of form 'key [] ([()],,,)'. where optional says that only first characters of fieldvalue need to be considered when computing hash value. A unique key is declared similarly, only 'key' is replaced by 'unique'. A primary key is declared similarly, only 'key' is replaced by 'primary key'. These fieldspecs are to be added in a 'create table' or 'alter table' statement. All fields of a primary key must be declared as 'not null', which is done by for such fields using a fieldspec of form ' not null'. Keys can also be created by 'create index' queries, but these are all aliases for 'alter table' queries (see furtheron for 'alter table'), so i do not show these. To see which fields, fieldtypes and attributes a table has : describe If you are only interested in some fields, use : describe where can contain sql wildcard characters % and _ (see furtheron). ----------------------------------------------------------------------------------------- TABLETYPES It is also possible to specify type of table to create in a 'create table' statement. Default type is MyISAM, and it is best choice for tables that don't need to be transaction-safe, ie when you don't use transactions. For tabletypes to use for transaction-safe mode, see info. For temporary tables, using HEAP tables is advantageous, as these are stored in RAM, and are thus very fast. There are some restrictions on using HEAP tables : (new words will be explained furtheron) Heap tables can not have blob, text or auto_increment columns. Heap tables can not have a key on a 'null' column. Selecting from a heap table can not have an 'order by' on a key. Do use max_rows to limit RAM usage when using HEAP tables. HEAP tables are not recommended for permanent tables because a crash will cause loss of all data of this kind of table. Dont forget to drop table when done with it, to free memory. ----------------------------------------------------------------------------------------- FULL SYNTAX OF A CREATE TABLE STATEMENT create [temporary] table [if not exists] [] [