Making MySQL WebScale - A Lesson in Database Normalization

MySQL is designed around doing good JOINS. Databases like looking up records from related tables; it makes them happy. As such, its a good idea to store every piece of data exactly once: then there’s only ever one place to store it.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
SELECT
C.id AS customer_id,
FNAME.value AS first_name,
LNAME.value AS last_name,
PHONE.value AS phone_number,
O.id AS order_id,
SHIP1.value AS shipping_line_one,
SHIP2.value AS shipping_line_two,
STATE.value AS shipping_state,
CITY.value AS shipping_city
ZIP.value AS shipping_zip,
OL.id AS order_line_id
ODATE.value AS order_date,
P.id AS product_id,
PNAME.value AS product_name,
QTY.value AS quantity,
PRICE.value AS price
FROM
customers C
JOIN strings FNAME ON (C.first_name_string_id = FNAME.id)
JOIN strings LNAME ON (C.last_name_string_id = LNAME.id)
JOIN integers PHONE ON (PHONE.id = C.phone_integer_id)
JOIN orders O ON (O.customer_id = C.id)
JOIN strings SHIP1 ON (SHIP1.id = O.shipping_line_one_string_id)
JOIN strings SHIP2 ON (SHIP2.id = O.shipping_line_two_string_id)
JOIN strings CITY ON (CITY.id = O.shipping_city_integer_id)
JOIN strings STATE ON (STATE.id = O.shipping_state_integer_id)
JOIN integers ZIP ON (ZIP.id = O.shipping_zip_integer_id)
JOIN order_lines OL ON (OL.order_id = O.id)
JOIN dates ODATE ON (ODATE.id = OL.order_date_date_id)
JOIN products P ON (OL.product_id = P.id)
JOIN strings PNAME ON (PNAME.id = P.name_string_id)
JOIN integers QTY ON (QTY.id = OL.order_quanitity_integer_id)
JOIN floats PRICE ON (PRICE.id = OL.price_float_id)

This is really fast because EVERY SINGLE JOIN is a Primary Key look up. WebScale aficionados know that PK searches are the fastest look up available to MySQL: its only limited by the speed of light inside the CPU cache-bus-lines.

Creating a new order is Super EASY! Let me show you:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
// Use $_POST instead of $_GET ($_POST cant be faked
// with the URL bar in the browser
$fname_id = query("SELECT id FROM strings WHERE value='$_POST[fname]'");
if (!$fname_id)
$fname_id = execute_with_last_insert_id(
"INSERT INTO strings SET value=$_POST[fname]");
}

$lname_id = query("SELECT id FROM strings WHERE value='$_POST[lname]'");
if (!$lname_id)
$lname_id = execute_with_last_insert_id(
"INSERT INTO strings SET value=$_POST[lname]");
}

$phone_id = query("SELECT id FROM integers WHERE value='$_POST[phone]'");
if (!$phone_id)
$phone_id = execute_with_last_insert_id(
"INSERT INTO integers SET value=$_POST[phone]");
}
$shipping_line_one_string_id = query("SELECT id FROM strings WHERE value='$_POST[shipping_line_1]'");
if (!$shipping_line_one_string_id)
$shipping_line_one_string_id = execute_with_last_insert_id(
"INSERT INTO strings SET value=$_POST[shipping_line_1]");
}

$shipping_line_two_string_id = query("SELECT id FROM strings WHERE value='$_POST[shipping_line_2]'");
if (!$shipping_line_two_string_id)
$shipping_line_two_string_id = execute_with_last_insert_id(
"INSERT INTO strings SET value=$_POST[shipping_line_2]");
}
$shipping_state_string_id = query("SELECT id FROM strings WHERE value='$_POST[shipping_state]'");
if (!$shipping_state_string_id)
$shipping_line_state_string_id = execute_with_last_insert_id(
"INSERT INTO strings SET value=$_POST[shipping_state]");
}
$shipping_line_city_string_id = query("SELECT id FROM strings WHERE value='$_POST[shipping_city]'");
if (!$shipping_city_string_id)
$shipping_city_string_id = execute_with_last_insert_id(
"INSERT INTO strings SET value=$_POST[shipping_city]");
}

$shipping_zip_integer_id = query("SELECT id FROM integers WHERE value='$_POST[shipping_zip]'");
if (!$shipping_zip_integer_id)
$shipping_zip_integer_id = execute_with_last_insert_id(
"INSERT INTO integers SET value=$_POST[shipping_zip]");
}
$order_date_date_id = query("SELECT id FROM dates WHERE value='$_POST[order_date]'");
if (!$order_date_date_id)
$order_date_date_id = execute_with_last_insert_id(
"INSERT INTO dates SET value=$_POST[order_date]");
}

$quantity_integer_id = query("SELECT id FROM integers WHERE value='$_POST[qty]'");
if (!$quantity_integer_id)
$quantity_integer_id = execute_with_last_insert_id(
"INSERT INTO integers SET value=$_POST[qty]");
}

$price_float_id = query("SELECT id FROM floats WHERE value='$_POST[price]'");
if (!$price_float_id)
$price_float_id = execute_with_last_insert_id(
"INSERT INTO floats SET value=$_POST[price]");
}
// finally,
$customer_id = execute_with_last_insert_id(
"INSERT INTO customers SET
fname_string_id = $fname_id,
lname_string_id = $lname_id,
phone_integer_id = $phone_integer_id");

$order_id = execute_with_last_insert_id(
"INSERT INTO orders SET
customer_id = $customer_id,
shipping_line_one_string_id = $shipping_one_string_id,
shipping_line_two_string_id = $shipping_two_string_id,
city_string_id = $shipping_city_string_id,
state_string_id = $shipping_state_string_id,
shipping_zip_integer_id = $shipping_zip_integer_id");

execute_with_last_insert_id(
"INSERT INTO order_lines SET
order_id = $order_id,
product_id = $product_id,
quantity_integer_id = $quantity_integer_id
price_float_id = $price_float_id");

Whoever said PHP and MySQL couldn’t scale????