CS186 SQL1 Berkeley

写在前面

Berkeley CS186 Intro to DB Systems

视频地址 课程记录

SQL 1

what is database

一些术语

Relation/Table

  • Schema: description("metadata")
  • Instance: set of data satisfying the schema

Attribute/Column/Field

Tuple/Record/Row

一些注意事项

  1. 每列的类型都是原子类型(无法再细分的),因此不会是list或者map那种类型的数据
  2. saddress 每一行都要fit the schema 简单说就是三行三列的表中不能插入四列的数据行
  3. baddress 每一列的名字不能相同
  4. maddress 数据类型只能是原子类型

SQL Language

Two sublanguages

DDL-Data Definition Language-Define and modify schema 用于定义和修改数据库架构

DML-Data Manipulation Language-Queries can be written intuitively 用于操作实际数据

The SQL DDL: Sailors

以Sailors为例子,来创建数据表

1
2
3
4
5
6
7
CREATE TABLE Sailors (
sid INTEGER,
sname CHAR(20),
rating INTEGER,
age FLOAT,
PRIMARY KEY (sid)
);
sidSNAMERATINGAGE
1Fred722
2Jim239
3Nancy827
1
2
3
4
5
6
CREATE TABLE Boats (
bid INTEGER,
bname CHAR(20),
color CHAR(10),
PRIMARY KEY (Bid)
);
BIDBNAMECOLOR
101Ninared
102Pintablue
103Santa Mariared
1
2
3
4
5
6
7
CREATE TABLE Reserves (
sid INTEGER,
bid INTEGER,
day DATE,
PRIMARY KEY (sid, bid, day),
FOREIGN KEY (sid) REFERENCES Sailors;
);
sidbidday
11029/12
21029/13
Primary Key column(s)
  • provide a unique “lookup key” for the relation 提供关系的唯一查找键 sid对于这些键值是唯一的
  • cannot have any duplicate values 不能有重复的值,主键的值在表中必须是唯一的
  • can be made up of > 1 column 主键可以由多列组成 由多列组成的话就叫做复合主键(kimi给的例子:如果一个表中存储了多个部门的员工信息,可能需要使用部门ID和员工ID的组合作为主键来确保唯一性 )
Foreign Key column(s)

Foreign Key column(s) 外键

外键应用另一个表的主键或唯一键

上述外键是Sailors表的指针,即Reservessid指向Sailorssid

  • Foreign Key references a table via the primary key of that table 外键通过另一个表的主键来引用另一个表
  • need not share the name of the referenced primary key 不一定和被引用的主键同名
1
2
3
4
5
6
7
8
CREATE TABLE Reserves (
sid INTEGER,
bid INTEGER,
day DATE,
PRIMARY KEY (sid, bid, day),
FOREIGN KEY (sid) REFERENCES Sailors,
FOREIGN KEY (bid) REFERENCES Boats;
);

The SQL DML

Basic Single-Table Queries
1
SELECT [DISTINCT] <column expression list> FROM <single table> [WHERE <predicate>]
SELECT DISTINCT
1
2
3
SELECT DISTINCT S.name, S.gpa
FROM students (AS) S
WHERE S.dept = 'CS'
  • DISTINCT specifies removal of duplicate rows before output DISTINCT 在输出前会移除重复行,如果查询结果有多行相同数据,DISTINCT 会保证这些重复行只被显示一次
  • Can refer to the students table as “S”, this is called an alias 把学生列表记作S,S是其别名,可以简化查询,提高可读性 定义别名的AS可省略
ORDER BY
1
2
3
4
SELECT S.name, S.gpa, S.age * 2 AS A2
FROM Students S
WHERE S.dept = 'CS'
ORDER BY S.gpa, S.name, A2
  • 将两倍年龄别名为A2
  • ORDER BY clause specifies output to be sorted-Lexicographic ordering 按照字典序排序输出
1
2
3
4
SELECT S.name, S.gpa, S.age * 2 AS A2
FROM Students S
WHERE S.dept = 'CS'
ORDER BY S.gpa DESC, S.name ASC, A2
  • DESC 降序 ASC 升序 默认是升序
LIMIT
1
2
3
4
5
SELECT S.name, S.gpa, S.age * 2 AS A2
FROM Students S
WHERE S.dept = 'CS'
ORDER BY S.gpa DESC, S.name ASC, a2;
LIMIT 3;
  • 限制输出为几行
Aggregates
1
2
3
SELECT [DISTINCT] AVG(S.gpa)
FROM students S
WHERE S.dept = 'CS'
  • AVG():计算平均值

  • Before producing output, compute a summary(a.k.a. an aggregate) of some arithmetic expression

  • produce 1 row of output -with one column in this case

  • other aggregates: SUM, COUNT, MAX, MIN

GROUP BY
1
2
3
SELECT [DISTINCT] AVG(S.gpa), S.dept
FROM students S
GROUP BY S.dept
  • GROUP BY:按照GROUP BY后面列出的列,进行分组

  • Partition table into groups with same GROUP BY column values - Can group by a list of columns 将表按照具有相同GROUP BY列值的组进行分区,可以根据一列或多列进行分组

  • Produce an aggregate result per group -Cardinality of output = # of distinct group values 为每个组生成一个聚合结果,输出的基数等于不同组值的数量

  • Note: can put grouping columns in SELECT list 可以将分组放入SELECT列表

HAVING
1
2
3
4
SELECT [DISTINCT] AVG(S.gpa), S.dept
FROM students S
GROUP BY S.dept
HAVING COUNT(*) > 2
  • The HAVING predicate filters groups 用于过滤分组后的结果
  • HAVING is applied after grouping and aggregation HAVING子句在分组和聚合计算完成后应用 -Hence can contain anything that could go in the SELECT list 因此可以包含选择列表中的任何内容
  • HAVING can only be used in aggregate queries HAVING只能用在聚合查询中
  • it’s an optional clause 可选非必须
Putting it all together
1
2
3
4
5
6
SELECT S.dept, AVG(S.gpa), COUNT(*)
FROM students S
WHERE S.gender = 'F'
GROUP BY S.dept
HAVING COUNT(*) >= 2
ORDER BY S.dept
DISTINCT Aggregates

DISTINCT can be used inside or outside of aggregate function, but have different meanings

1
2
3
4
5
6
7
SELECT COUNT(DISTINCT S.name)
FROM Students S
WHERE S.dept = 'CS'

SELECT DISTINCT COUNT(S.name)
FROM Students S
WHERE S.dept = 'CS'
  • 第一个:在计数之前先把所有重复项移除 相当于只计算有多少中不同项
  • 第二个:先计数,再移除重复项 相当于计算所有项
What Is This Asking For?
1
2
3
SELECT S.name, AVG(S.gpa)
FROM Students S
GROUP BY S.dept
  • illegal query 因为S.name没有被包含在GROUP BY 子句中,也没有被用在聚合函数中。数据库不知道应该选择哪个学生的姓名来显示,因为可能有多个学生具有相同的系别(S.dept)。
SQL DML: General Basic Single-Table Queries
1
2
3
4
5
6
SELECT [DISTINCT] <column expression list> FROM <single table>
[WHERE <predictable>]
[GROUP BY <column list>]
[HAVING <predicate>]
[ORDER BY <column list>]
[LIMIT <integer>]
  • Copyrights © 2024-2025 brocademaple
  • 访问人数: | 浏览次数:

      请我喝杯咖啡吧~

      支付宝
      微信