数据挖掘与分析 - Data Mining & Analytics (SU)

发布于:2024-10-08 ⋅ 阅读:(9) ⋅ 点赞:(0)

SQL

Introduction

        The SQL is one of the most crucial skills and first step when carrying out Data Mining. So let's review SQL making you more exquisite.

        The exercises are based on the library example. The purpose of this database is to keep track of the availability of books in a small lending library and who currently has these books on loan. The library would like to be able to contact customers who have overdue books and customers would like to be able to search for books on different subjects. The library.dbdatabase is available for download if you ask in personal message. We will also consider examples from fruitandveg.db, a database describing transactions the movement of fruit and vegetables between suppliers and customers via a warehouse.

The SELECT Statement

1.WHERE:指定搜索条件

2.GROUP BY:计算一列中数据的聚合函数,按第二列的数据分组列

3.HAVING:过滤受条件限制显示的组

4.排序方式:对要显示的数据进行排序。

每个条款都是可选的,但顺序必须保持不变。

 

Where

        WHERE语句确保只显示满足指定条件的数据。按顺序要输出表格的所有列,使用∗作为速记(i.e. SELECT ∗ FROM tableName WHERE
where condition;).

        请注意,SQL中WHERE和HAVINF子句的关键区别在于WHERE子句用于在分组或聚合发生之前过滤记录,而 HAVING子句用于发生分组或聚合后过滤记录。

 

Group by

        GROUP BY语句允许我们生成一组汇总结果。例如,分组可用于生成一个表格,列出每个客户的订单数量。

        同样,以下查询只返回表格前5行的结果。

SELECT CustomerID, COUNT(Dispatch.CustomerID) AS NumOrders
FROM Dispatch
GROUP BY Dispatch.CustomerID
LIMIT 5;

Order by

        使用 `ORDER BY` 可以根据一个或多个列对查询结果进行排序。使用 `ASC` 表示按升序排列,`DESC` 表示按降序排列。以下示例展示了如何使用该语句对不同水果的记录数量进行排序,并选择仅输出苹果和香蕉的结果;然后按订单数量进行分组,并按最高订单数量优先显示结果。 

SELECT ProductDescription, COUNT(ProductID) AS NumOrders
FROM Product
GROUP BY ProductDescription
HAVING ProductDescription = 'Apples' OR ProductDescription = 'Bananas'
ORDER BY NumOrders ASC;

(一定要根据水果名分组,如果不分组的话count(水果名)输出的是全部水果,所以不能用where子句)

        可以按多列排序。例如,如果我们希望看到每个水果的所有订单,水果的类型按字母顺序排列,然后按照采摘日期的递增顺序排列:

SELECT ProductDescription, PickedDate from Product
ORDER BY ProductDescription, PickedDate;

SELECT Exercises
Write queries to output tables detailing the following.
1. List of users who either loaned or returned a book on 16 October 2014, detailing the user ID (in ascending order), the ID numbers of any books that they have loaned/returned (in ascending order), and whether the book was loaned or returned.

select user_id, book_id, trans_type from transactions
where true_date
between '2014-10-16 00:00:00' and '2014-10-16 23:59:59'
order by user_id, book_id;


2. List the ID number and a count of the number of books that the corresponding user has taken out of the library. Limit to the first 5 users to reduce the amount of output.

SELECT user_id,count(*) as totalnumber from transactions
WHERE trans_type = 'Loan'
GROUP BY user_id
ORDER BY user_id LIMIT 10;


3. A count of the number of books in the library in the subject areas of Optimisation and Operational Research.

SELECT num_copies, title FROM books
GROUP BY book_id
Having topic = 'Optimisation';


//或者

SELECT num_copies, title FROM books
WHERE topic = 'Optimisation'
GROUP BY book_id;


网站公告

今日签到

点亮在社区的每一天
去签到