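Source: Kuaishou
I. Problem Description
The Kuaishou platform wants to improve its friend-recommendation algorithm. From users' follow records, select the pairs of users who follow each other (mutual follows). These pairs are used to analyze the behavior of highly interactive users and to build the social graph.
Sample Data
Assume the follow-relationship table follow contains the following data: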
from_user | to_user |
---|---|
A | B |
B | A |
A | C |
C | A |
B | C |
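II. Implementation Steps
Step 1: Select mutually following user pairs
Purpose: self-join the follow table to find pairs of users who follow each other. A row (x, y) from t1 is kept only when the reverse row (y, x) exists in t2.
SQL: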
select
t1.from_user as user_a,
t1.to_user as user_b
from
follow t1
join
follow t2
on
t1.from_user = t2.to_user
and t1.to_user = t2.from_user;
Result (the row B | C is dropped because C does not follow B back):
user_a | user_b |
---|---|
A | B |
B | A |
A | C |
C | A |
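The step 1 query can be checked against the sample data with a minimal sketch like the one below. It assumes an in-memory SQLite database and text columns (the article does not state the engine or the column types), and it adds an order by only to make the output deterministic.

```python
import sqlite3

# Sample rows from the follow table in this article.
rows = [("A", "B"), ("B", "A"), ("A", "C"), ("C", "A"), ("B", "C")]

# Assumption: in-memory SQLite with text columns, used only for this check.
conn = sqlite3.connect(":memory:")
conn.execute("create table follow (from_user text, to_user text)")
conn.executemany("insert into follow values (?, ?)", rows)

# Step 1 self-join: keep (x, y) only when the reverse row (y, x) exists.
# "order by" is not part of the original query; it fixes the row order.
step1_sql = """
select t1.from_user as user_a, t1.to_user as user_b
from follow t1
join follow t2
  on t1.from_user = t2.to_user
 and t1.to_user = t2.from_user
order by user_a, user_b
"""
step1_pairs = conn.execute(step1_sql).fetchall()
print(step1_pairs)
```

Each mutual pair shows up twice, once per direction, which is what step 2 addresses.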
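Step 2: Deduplicate
Purpose: output each user pair exactly once. The where filter keeps only the row whose from_user sorts before its to_user, so the mirrored row is discarded. With that filter in place the case expressions always return the columns unchanged; they only make the ordering explicit.
SQL: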
select
case when t1.from_user < t1.to_user then t1.from_user else t1.to_user end as user_a,
case when t1.from_user < t1.to_user then t1.to_user else t1.from_user end as user_b
from
follow t1
join
follow t2
on
t1.from_user = t2.to_user
and t1.to_user = t2.from_user
where
t1.from_user < t1.to_user;
Result:
user_a | user_b |
---|---|
A | B |
A | C |
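One caveat worth checking: the join in step 2 yields each pair once only if follow holds no repeated rows. The sketch below is a hedged illustration of that, assuming SQLite and one hypothetical duplicate row (A, B) that is not part of the original sample data. It compares the query with and without select distinct.

```python
import sqlite3

# Sample rows plus one hypothetical duplicate of A -> B (not in the article's data).
rows = [("A", "B"), ("B", "A"), ("A", "C"), ("C", "A"), ("B", "C"), ("A", "B")]

conn = sqlite3.connect(":memory:")
conn.execute("create table follow (from_user text, to_user text)")
conn.executemany("insert into follow values (?, ?)", rows)

# Join and filter from step 2, shared by both queries below.
join_body = """
from follow t1
join follow t2
  on t1.from_user = t2.to_user
 and t1.to_user = t2.from_user
where t1.from_user < t1.to_user
order by 1, 2
"""
cols = "t1.from_user as user_a, t1.to_user as user_b"

# Without distinct, the duplicated A -> B row matches B -> A twice.
without_distinct = conn.execute("select " + cols + join_body).fetchall()
# With distinct, each pair is reported once again.
with_distinct = conn.execute("select distinct " + cols + join_body).fetchall()

print(without_distinct)
print(with_distinct)
```

If the table is known to have a unique constraint on (from_user, to_user), the distinct is unnecessary.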
Final SQL
select
case when t1.from_user < t1.to_user then t1.from_user else t1.to_user end as user_a,
case when t1.from_user < t1.to_user then t1.to_user else t1.from_user end as user_b
from
follow t1
join
follow t2
on
t1.from_user = t2.to_user
and t1.to_user = t2.from_user
where
t1.from_user < t1.to_user;
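III. Alternative Approach
Idea: use least and greatest to normalize every row to an ordered pair, group by that pair, and keep the pairs that occur at least twice. This assumes follow contains no duplicate rows (otherwise a one-way follow recorded twice would also pass the filter) and that the database provides least and greatest, as MySQL and PostgreSQL do.
select
least(from_user, to_user) as user_id,
greatest(from_user, to_user) as friend_id
from follow
group by least(from_user, to_user), greatest(from_user, to_user)
having count(*) >= 2;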
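The grouping approach can be tried out with the sketch below. It assumes SQLite, which has no least or greatest, so the two-argument scalar min() and max() functions stand in for them. The second query, which uses count(distinct from_user) = 2, is a suggested variant rather than part of the original solution: it requires both directions to be present, so a one-way follow stored twice is not counted.

```python
import sqlite3

# Grouping query as in this section, with SQLite's min()/max() replacing least/greatest.
COUNT_STAR_SQL = """
select min(from_user, to_user) as user_id,
       max(from_user, to_user) as friend_id
from follow
group by min(from_user, to_user), max(from_user, to_user)
having count(*) >= 2
order by user_id, friend_id
"""

# Suggested variant (an assumption, not from the article): require both directions.
BOTH_DIRECTIONS_SQL = """
select min(from_user, to_user) as user_id,
       max(from_user, to_user) as friend_id
from follow
group by min(from_user, to_user), max(from_user, to_user)
having count(distinct from_user) = 2
order by user_id, friend_id
"""

def run_on(rows, sql):
    """Load rows into a fresh in-memory follow table and run sql against it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table follow (from_user text, to_user text)")
    conn.executemany("insert into follow values (?, ?)", rows)
    result = conn.execute(sql).fetchall()
    conn.close()
    return result

sample = [("A", "B"), ("B", "A"), ("A", "C"), ("C", "A"), ("B", "C")]
# Hypothetical bad data: a one-way follow stored twice.
one_way_twice = [("A", "B"), ("A", "B")]

print(run_on(sample, COUNT_STAR_SQL))
print(run_on(sample, BOTH_DIRECTIONS_SQL))
print(run_on(one_way_twice, COUNT_STAR_SQL))
print(run_on(one_way_twice, BOTH_DIRECTIONS_SQL))
```

On the sample data both queries agree with the self-join result. They differ only on the duplicated one-way follow, where count(*) >= 2 reports a false mutual pair.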