Solving the Hibernate N+1 Problem

Published 2019-03-10, updated 2022-02-24, category: JPA

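The Problem

As an example, I will use a simplified version of an online book-ordering application. In such an application, I might create an entity like the following to represent a purchase order: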

@Entity
public class PurchaseOrder {

    @Id
    private String id;
    private String customerId;

    @OneToMany(cascade = ALL, fetch = EAGER)
    @JoinColumn(name = "purchase_order_id")
    private List<PurchaseOrderItem> purchaseOrderItems = new ArrayList<>();
}

A purchase order consists of an order ID, a customer ID and one or more items being purchased. The PurchaseOrderItem entity might have the following structure:

@Entity
public class PurchaseOrderItem {

    @Id
    private String id;

    private String bookId;
}

Now suppose we need to find a customer's orders so that we can display them in the purchase history. The following query serves that purpose:

SELECT
    P
FROM
    PurchaseOrder P
WHERE
    P.customerId = :customerId

Translated to SQL, it looks like this:

select
    purchaseor0_.id as id1_1_,
    purchaseor0_.customer_id as customer2_1_ 
from
    purchase_order purchaseor0_ 
where
    purchaseor0_.customer_id = ?

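This single query returns all purchase orders that belong to the customer. To obtain the order items, however, JPA issues a separate query for each order (immediately with the EAGER mapping above, or on first access to the items with a LAZY mapping). For example, if the customer has 5 orders, JPA issues 5 additional queries to fetch the items of those orders. This is the N+1 problem: 1 query for the orders leads to N further queries for their items.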
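The query count described above can be reproduced with a toy model. The sketch below involves no JPA and no database: the class NPlusOneDemo, its in-memory map and the "order-1" style ids are all hypothetical, and each method call simply stands in for one SQL round trip.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** Toy in-memory model of the N+1 pattern; no JPA or database involved. */
final class NPlusOneDemo {

    /** Counts how many "queries" the fake data access methods have executed. */
    static int queryCount = 0;

    // Hypothetical data: order id -> book ids of its items (5 orders, 2 items each).
    static final Map<String, List<String>> ITEMS_BY_ORDER = new HashMap<>();

    static {
        for (int i = 1; i <= 5; i++) {
            ITEMS_BY_ORDER.put("order-" + i, List.of("book-" + i + "a", "book-" + i + "b"));
        }
    }

    /** Stands in for the single query that returns all orders of the customer. */
    static List<String> findOrderIds() {
        queryCount++;
        return new ArrayList<>(new TreeSet<>(ITEMS_BY_ORDER.keySet()));
    }

    /** Stands in for the per-order query that select fetching issues. */
    static List<String> findItems(String orderId) {
        queryCount++;
        return ITEMS_BY_ORDER.get(orderId);
    }

    /** Loads every order with its items and returns the number of queries used. */
    static int loadHistory() {
        queryCount = 0;
        for (String orderId : findOrderIds()) {
            findItems(orderId);
        }
        return queryCount;
    }

    public static void main(String[] args) {
        System.out.println("queries executed: " + loadHistory()); // 1 + 5 = 6
    }
}
```

With 5 orders the loop costs 6 round trips, and the count grows linearly with the number of orders, which is why the pattern becomes expensive for long order histories.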

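Fetching

Before exploring solutions, let's first look at how fetching works in JPA.

Fetching is essentially the process of getting data from the database and making it available to the application. How an application fetches data largely determines how it performs. Fetching too much data, in terms of width (values/columns) and/or depth (results/rows), adds unnecessary overhead in both JDBC communication and ResultSet processing. Fetching too little data may force additional fetches later. Tuning how an application fetches data can therefore influence overall application performance.

The concept of fetching breaks down into two questions:

When should the data be fetched? Now or later?
How should the data be fetched?

Hibernate defines fetching at several scopes. Hibernate's @Fetch annotation defines how data is fetched, while the fetch attribute (javax.persistence.FetchType) on javax.persistence annotations such as @Basic, @ManyToOne and @OneToMany defines when it is fetched.

Static (compile time)

We can define the fetch strategy statically when defining the mapping.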

SELECT

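Performs a separate SQL query to load the data. It can be EAGER (the second query is issued immediately) or LAZY (the second query is delayed until the data is needed). This is the strategy generally referred to as N+1.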

JOIN

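Fetches the data using an SQL OUTER JOIN. With this strategy the data is always loaded eagerly, even if LAZY is specified.

The annotation works as expected when a single entity is loaded by primary key, for example with findOne or findById in Spring Data JPA. With findAll, Hibernate still falls back to N+1 queries.

A join query produces duplicate results, so they have to be deduplicated, for example by storing the result in a Set.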
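To illustrate the deduplication step, here is a minimal sketch in plain Java. It assumes a flattened join result in which the parent order appears once per joined item row; the class JoinDedupDemo and the ids are hypothetical. A LinkedHashSet is used rather than a HashSet so the original row order is preserved.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy illustration of deduplicating parents from a joined result. */
final class JoinDedupDemo {

    /** Hypothetical join result: the owning order id repeated once per item row. */
    static List<String> joinedOrderIds() {
        return List.of("order-1", "order-1", "order-2", "order-2", "order-2", "order-3");
    }

    /** Removes duplicate parents while keeping the order of first appearance. */
    static List<String> distinctOrders(List<String> rows) {
        Set<String> unique = new LinkedHashSet<>(rows);
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        System.out.println(distinctOrders(joinedOrderIds())); // [order-1, order-2, order-3]
    }
}
```

In a JPQL query, writing SELECT DISTINCT is another common way to get each parent back only once.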

BATCH

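Performs a separate SQL query that loads a number of related data items at once, using an IN restriction in the SQL WHERE clause sized by the batch size. It can likewise be combined with EAGER or LAZY.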

SUBSELECT

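Performs a separate SQL query, built from the restriction used to load the collection owners, to load the associated data. It can likewise be combined with EAGER or LAZY.

A subselect loads all of the associated data into memory at once, so on a @OneToMany, for example, it is suitable when the associated data set is small.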

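Dynamic (runtime)

Dynamic definitions offer a more flexible way to specify the fetch strategy at runtime.

fetch profiles

Defined in the mappings, but can be enabled or disabled on the Session.

HQL/JPQL

Both HQL/JPQL queries and Hibernate/JPA Criteria queries can specify fetching that applies only to that query.

entity graphs

Starting with Hibernate 4.2 (JPA 2.1), entity graphs give us a more detailed way to define the fetch plan.

With fetching understood, we can now look for ways to solve the problem.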

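Solutions

Avoid eager fetching

Eager fetching is the main cause of the problem. We should remove all eager fetching from our mappings; it offers almost no benefit that would justify its use in a production-grade application. Every association should be marked lazy. Note that @ManyToOne and @OneToOne default to EAGER, so they need an explicit fetch = LAZY.

Fetch only the data you really need

Sometimes we do not want to load all of an order's items when querying the order. We can map the items as lazily loaded and query them only when we actually need them.

Use a fetch join in JPQL

A better option for initializing lazy associations is a JPQL query with a fetch join.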

Query q = this.em.createQuery("SELECT o FROM Order o JOIN FETCH o.items i WHERE o.id = :id");
q.setParameter("id", orderId);
Order newOrder = (Order) q.getSingleResult();

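This tells the entity manager to load the selected entity and the association in the same query.

Use @BatchSize for batch fetching

Batch fetching is an optimization of the lazy select fetching strategy. Suppose 25 orders have been loaded and each has a lazy collection of items. Without batching, initializing every collection takes 25 queries. With a batch size of 10, Hibernate initializes the collections of up to 10 orders per query using an IN restriction, so only 3 queries are needed (10, 10 and 5 orders).

The @BatchSize annotation can be placed on a lazily loaded collection or on an entity.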

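@Entity
@BatchSize(size = 100) // batch-load lazy references to this entity
class PurchaseOrderItem {
...
}

@OneToMany
@BatchSize(size = 5) // initialize the item collections of up to 5 orders per query
private List<PurchaseOrderItem> purchaseOrderItems = new ArrayList<>();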
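The effect of the batch size on the number of queries is simple ceiling division. The sketch below is plain Java with a hypothetical class BatchQueryCount, and it assumes every batch is filled up to the configured size; the batches Hibernate actually issues can differ by version and configuration.

```java
import java.util.ArrayList;
import java.util.List;

/** Estimates how many IN queries batch fetching needs (idealized model). */
final class BatchQueryCount {

    /** Number of queries needed to initialize the lazy collections of `owners` entities. */
    static int queriesNeeded(int owners, int batchSize) {
        if (owners < 0 || batchSize < 1) {
            throw new IllegalArgumentException("owners must be >= 0 and batchSize >= 1");
        }
        return (owners + batchSize - 1) / batchSize; // ceiling division
    }

    /** Number of owners covered by each query, in execution order. */
    static List<Integer> batchSizes(int owners, int batchSize) {
        queriesNeeded(owners, batchSize); // reuse the argument validation
        List<Integer> sizes = new ArrayList<>();
        for (int remaining = owners; remaining > 0; remaining -= batchSize) {
            sizes.add(Math.min(remaining, batchSize));
        }
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(queriesNeeded(25, 10) + " " + batchSizes(25, 10)); // 3 [10, 10, 5]
        System.out.println(queriesNeeded(25, 5)); // 5
    }
}
```

Under this model, 25 orders need 3 queries with a batch size of 10, 5 queries with a batch size of 5, and 25 queries with no batching at all.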

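Entity Graphs

Entity graphs are templates for a particular query or operation. They are used when creating **fetch plans**. Application developers use fetch plans to group related persistent fields together in order to improve runtime performance.

By default, entity fields or properties are **fetched lazily**. Developers specify fields or properties as part of a fetch plan, and the persistence provider then fetches them eagerly.

We can create entity graphs statically, using annotations or a deployment descriptor (such as orm.xml), or dynamically, using the standard interfaces.

An entity graph defines the fields that must be fetched eagerly during a find or query operation.

By default, all fields of an entity are fetched lazily unless the fetch attribute of the entity metadata is set to javax.persistence.FetchType.EAGER. The primary key and version fields are always fetched eagerly, however, and do not need to be added to the entity graph explicitly.

An entity graph can be applied as either a fetch graph or a load graph.

Fetch Graphs

When the javax.persistence.fetchgraph property is used to specify an entity graph, attributes specified by attribute nodes of the graph are treated as FetchType.EAGER, and attributes that are not specified are treated as FetchType.LAZY.

Load Graphs

When the javax.persistence.loadgraph property is used to specify an entity graph, attributes specified by attribute nodes of the graph are treated as FetchType.EAGER, and attributes that are not specified are treated according to their specified or default FetchType.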
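The difference between the two graph kinds comes down to how attributes outside the graph are treated. The sketch below models only that rule in plain Java. The class GraphSemanticsDemo and its enums are hypothetical stand-ins rather than the JPA types, and the model ignores the always-eager id and version fields as well as any provider-specific behaviour.

```java
import java.util.Set;

/** Toy model of fetch graph versus load graph semantics. */
final class GraphSemanticsDemo {

    enum FetchType { EAGER, LAZY }

    enum GraphKind { FETCH, LOAD }

    /**
     * Returns the fetch type an attribute is treated with.
     *
     * @param kind            whether the graph is applied as a fetch graph or a load graph
     * @param graphAttributes attribute names listed in the entity graph
     * @param attribute       the attribute being resolved
     * @param mapped          the attribute's specified or default FetchType from the mapping
     */
    static FetchType effective(GraphKind kind, Set<String> graphAttributes,
                               String attribute, FetchType mapped) {
        if (graphAttributes.contains(attribute)) {
            return FetchType.EAGER; // listed attributes are eager in both kinds
        }
        // Unlisted attributes: a fetch graph forces LAZY, a load graph keeps the mapping.
        return kind == GraphKind.FETCH ? FetchType.LAZY : mapped;
    }

    public static void main(String[] args) {
        Set<String> graph = Set.of("subject", "sender");
        System.out.println(effective(GraphKind.FETCH, graph, "body", FetchType.EAGER)); // LAZY
        System.out.println(effective(GraphKind.LOAD, graph, "body", FetchType.EAGER));  // EAGER
        System.out.println(effective(GraphKind.FETCH, graph, "sender", FetchType.LAZY)); // EAGER
    }
}
```

So an attribute mapped as EAGER but missing from the graph is the only case where the two kinds disagree: a fetch graph treats it as lazy, a load graph leaves it eager.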

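Named Entity Graphs

A named entity graph is an entity graph defined by the @NamedEntityGraph annotation applied to an entity class, or by a named-entity-graph element in the application's deployment descriptor. Named entity graphs defined in the deployment descriptor override any annotation-based entity graphs with the same name.

Fields are added to the entity graph by listing them in the attributeNodes element of @NamedEntityGraph with the javax.persistence.NamedAttributeNode annotation: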

@NamedEntityGraph(name="emailEntityGraph", attributeNodes={
    @NamedAttributeNode("subject"),
    @NamedAttributeNode("sender")
})
@Entity
public class EmailMessage {
    @Id
    String messageId;
    String subject;
    String body;
    String sender;
}

Multiple @NamedEntityGraph definitions can be applied to a class by grouping them inside a @NamedEntityGraphs annotation.

@NamedEntityGraphs({
    @NamedEntityGraph(name="previewEmailEntityGraph", attributeNodes={
        @NamedAttributeNode("subject"),
        @NamedAttributeNode("sender"),
        @NamedAttributeNode("body")
    }),
    @NamedEntityGraph(name="fullEmailEntityGraph", attributeNodes={
        @NamedAttributeNode("sender"),
        @NamedAttributeNode("subject"),
        @NamedAttributeNode("body"),
        @NamedAttributeNode("attachments")
    })
})
@Entity
public class EmailMessage { ... }

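A named entity graph that has been defined is obtained by calling EntityManager.getEntityGraph with its name.

EntityGraph<?> eg = em.getEntityGraph("emailEntityGraph");

Using Entity Graphs in Query Operations

To specify an entity graph for typed and untyped queries, call the setHint method on the query object, passing either javax.persistence.loadgraph or javax.persistence.fetchgraph as the property name and an EntityGraph instance as the value: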

EntityGraph<?> eg = em.getEntityGraph("previewEmailEntityGraph");
List<EmailMessage> messages = em.createNamedQuery("findAllEmailMessages")
        .setParameter("mailbox", "inbox")
        .setHint("javax.persistence.loadgraph", eg)
        .getResultList();

Typed queries use the same technique:

EntityGraph<?> eg = em.getEntityGraph("previewEmailEntityGraph");

CriteriaBuilder cb = em.getCriteriaBuilder();
CriteriaQuery<EmailMessage> cq = cb.createQuery(EmailMessage.class);
Root<EmailMessage> message = cq.from(EmailMessage.class);
cq.select(message);
TypedQuery<EmailMessage> q = em.createQuery(cq);
q.setHint("javax.persistence.loadgraph", eg);
List<EmailMessage> messages = q.getResultList();

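Dynamic Entity Graphs

A dynamic entity graph is created with EntityManager.createEntityGraph.

A dynamic entity graph is similar to a named entity graph. The only difference is that it is defined through the Java API.

EntityGraph<Order> graph = this.em.createEntityGraph(Order.class);
Subgraph<?> itemGraph = graph.addSubgraph("items");

Map<String, Object> hints = new HashMap<>();
hints.put("javax.persistence.loadgraph", graph);

Order order = this.em.find(Order.class, orderId, hints);

Creating the entity graph dynamically in code means we do not need annotations on the entity. So if you need a use-case-specific graph that will not be reused, I recommend a dynamic entity graph. If the graph is going to be reused, annotating a named entity graph is easier.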

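Using Entity Graphs in Spring Data JPA

In Spring Data JPA, we can put the org.springframework.data.jpa.repository.EntityGraph annotation on a repository query method to refer to a named entity graph or to define one ad hoc:

Set the value attribute to refer to a named entity graph
Set the attributePaths attribute to define an entity graph dynamically

The attributePaths attribute is an array, so we can list several attributes, and we can use the property.nestedProperty form to refer to nested attributes of an entity field.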
@EntityGraph(attributePaths = {"questions", "questions.questionOptions", "questions.answers"})
Optional<Questionnaire> findOneByProject_Id(Long id);

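Known Issues

In theory, when the graph type is FETCH, only the specified attributes are loaded eagerly. Before Hibernate 5.4.11, however, attributes whose mapping declares EAGER fetching were loaded as well. This issue (HHH-8776) has been fixed, so spring-boot-starter-data-jpa 2.2.5.RELEASE and later behave as expected.
The JPA fetch graph semantics do not apply to basic (@Basic) attributes in Hibernate. In other words, @Basic attributes follow their default fetch strategy, which is FetchType.EAGER, so they are loaded even when the entity graph does not list them. @Basic(fetch = FetchType.LAZY) only takes effect on basic attributes when bytecode enhancement is enabled.
Spring Data JPA currently only supports declaring the fetched attributes in the annotation; they cannot be specified dynamically at runtime. We can use the library at https://github.com/Cosium/spring-data-jpa-entity-graph to fill this gap, or write our own base repository class.

As the following code shows: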

@NoRepositoryBean
public interface BaseRepository<T, ID> extends JpaRepository<T, ID> {
    Page<T> findAll(Predicate predicate, Pageable pageable, JpaEntityGraph jpaEntityGraph);

    Iterable<T> findAll(Predicate predicate, JpaEntityGraph jpaEntityGraph);
}

@Transactional(readOnly = true)
public class BaseRepositoryImpl<T, ID extends Serializable> extends SimpleJpaRepository<T, ID> implements BaseRepository<T, ID> {
    private final Logger log = LoggerFactory.getLogger(BaseRepositoryImpl.class);

    private final EntityManager entityManager;
    private final JpaEntityInformation<T, ID> entityInformation;
    private EntityPath<T> path;
    private Querydsl querydsl;
    private final JPAQueryFactory jpaQueryFactory;

    public BaseRepositoryImpl(JpaEntityInformation<T, ID> entityInformation, EntityManager entityManager) {
        super(entityInformation, entityManager);

        this.entityManager = entityManager;
        this.entityInformation = entityInformation;
        this.jpaQueryFactory = new JPAQueryFactory(entityManager);
        try {
            this.path = SimpleEntityPathResolver.INSTANCE.createPath(entityInformation.getJavaType());
            this.querydsl = new Querydsl(entityManager, new PathBuilder<T>(path.getType(), path.getMetadata()));
        } catch (Exception e) {
            log.error("No generated Q class found for {}; please check your code", entityInformation.getJavaType().getSimpleName());
        }
    }
  
    @Override
    public Page<T> findAll(Predicate predicate, Pageable pageable, JpaEntityGraph jpaEntityGraph) {
        Assert.notNull(pageable, "Pageable must not be null!");

        // Avoid in-memory pagination; fixes the warning "HHH000104: firstResult/maxResults specified with collection fetch; applying in memory!"
        // The fix is to query in two steps, as described in https://vladmihalcea.com/fix-hibernate-hhh000104-entity-fetch-pagination-warning-message/
        SingularAttribute<? super T, ?> idAttribute = entityInformation.getIdAttribute();

        assert idAttribute != null;
        JPQLQuery<T> query = querydsl.applyPagination(pageable, doCreateQuery(null, predicate).select(ExpressionUtils.path(getDomainClass(), path, idAttribute.getName())));

        final QueryResults<T> results = query.fetchResults();

        Predicate pagePredicate = ExpressionUtils.in(ExpressionUtils.path(this.path.getClass(), idAttribute.getName()), results.getResults());

        final AbstractJPAQuery<Object, JPAQuery<Object>> domainContentQuery = doCreateQuery(jpaEntityGraph, pagePredicate);
        for (Sort.Order o : pageable.getSort()) {
            domainContentQuery.orderBy(new OrderSpecifier<>(o.isAscending() ? Order.ASC : Order.DESC, Expressions.stringPath(o.getProperty())));
        }
        return PageableExecutionUtils.getPage(domainContentQuery.select(path).fetch(), pageable, results::getTotal);
    }


    @Override
    public Iterable<T> findAll(Predicate predicate, JpaEntityGraph jpaEntityGraph) {
        return doCreateQuery(jpaEntityGraph, predicate).select(path).fetch();
    }

    /**
     * Creates a new {@link JPQLQuery} count query for the given {@link Predicate}.
     *
     * @param predicate, can be {@literal null}.
     * @return the Querydsl count {@link JPQLQuery}.
     */
    protected JPQLQuery<?> createCountQuery(@Nullable Predicate... predicate) {
        return doCreateQuery(null, predicate);
    }

    private AbstractJPAQuery<?, ?> doCreateQuery(JpaEntityGraph jpaEntityGraph, @Nullable Predicate... predicate) {

        AbstractJPAQuery<?, ?> query = querydsl.createQuery(path);

        if (predicate != null) {
            query = query.where(predicate);
        }

        Map<String, Object> hints = new HashMap<>(Jpa21Utils.tryGetFetchGraphHints(entityManager, jpaEntityGraph, entityInformation.getJavaType()));

        for (Map.Entry<String, Object> hint : hints.entrySet()) {
            query.setHint(hint.getKey(), hint.getValue());
        }

        return query;
    }