Getting query results as a stream with Hibernate 5.2

Date: 2022-09-11 12:40:55

Since Hibernate 5.2, we can use the stream() method instead of scroll() when we want to fetch a large amount of data.

However, when using scroll() with ScrollableResults, we can hook into the retrieval process and free up memory by evicting each object from the persistence context after processing it and/or clearing the entire session every now and then.

My questions:

  1. Now, if we use the stream() method, what happens behind the scenes?
  2. Is it possible to evict an object from the persistence context?
  3. Is the session cleared periodically?
  4. How is optimal memory consumption achieved?
  5. Is it possible to use e.g. StatelessSession?
  6. Also, if we have set hibernate.jdbc.fetch_size to some number (e.g. 1000) in the JPA properties, how does this combine with scrollable results?

2 answers

#1


8  

The following works for me:

DataSourceConfig.java

@Bean
public LocalSessionFactoryBean sessionFactory() {
    // Link your data source to your session factory
    ...
}

@Bean("hibernateTxManager")
public HibernateTransactionManager hibernateTxManager(@Qualifier("sessionFactory") SessionFactory sessionFactory) {
    // Link your session factory to your transaction manager
    ...
}

MyServiceImpl.java

@Service
@Transactional(propagation = Propagation.REQUIRES_NEW, transactionManager = "hibernateTxManager", readOnly = true)
public class MyServiceImpl implements MyService {

    @Autowired
    private MyRepo myRepo;
    ...
    Stream<MyEntity> stream = myRepo.getStream();
    // Do your streaming and CLOSE the stream afterwards
    ...

MyRepoImpl.java

@Repository
@Transactional(propagation = Propagation.MANDATORY, transactionManager = "hibernateTxManager", readOnly = true)
public class MyRepoImpl implements MyRepo {

    @Autowired
    private SessionFactory sessionFactory;

    @Autowired
    private MyDataSource myDataSource;

    public Stream<MyEntity> getStream() {

        return sessionFactory.openStatelessSession(DataSourceUtils.getConnection(myDataSource))
            .createNativeQuery("my_query", MyEntity.class)
            .setReadOnly(true)
            .setFetchSize(1000)
            .stream();
    }
    ...

Just remember that when you stream, you only really need to watch memory at the point where objects are materialised. That is the only part of the operation susceptible to memory problems. In my case I chunk the stream into batches of 1000 objects, serialise them with gson and send them to a JMS broker immediately. The garbage collector does the rest.
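
The chunking step described here could look roughly like the sketch below. It is plain Java with no Hibernate involved: StreamChunker and forEachBatch are made-up names, and the handler stands in for "serialise with gson and send to the broker". The idea is that only one batch is referenced at any time, so earlier batches can be collected once the handler returns.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.stream.Stream;

final class StreamChunker {

    // Pulls elements from the stream one at a time and hands each full batch to
    // the handler. A fresh list is started afterwards, so the previous batch is
    // no longer referenced here. Returns the number of batches handed out.
    static <T> int forEachBatch(Stream<T> stream, int batchSize, Consumer<List<T>> handler) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        // try-with-resources closes the stream even if the handler throws.
        try (Stream<T> s = stream) {
            Iterator<T> it = s.iterator();
            List<T> batch = new ArrayList<>(batchSize);
            while (it.hasNext()) {
                batch.add(it.next());
                if (batch.size() == batchSize) {
                    handler.accept(batch);
                    batch = new ArrayList<>(batchSize);
                    batches++;
                }
            }
            // Hand over the final, possibly smaller, batch.
            if (!batch.isEmpty()) {
                handler.accept(batch);
                batches++;
            }
        }
        return batches;
    }

    public static void main(String[] args) {
        // Hypothetical handler: a real one would serialise and send the batch.
        int n = forEachBatch(Stream.of(1, 2, 3, 4, 5), 2,
                batch -> System.out.println("sending " + batch));
        System.out.println(n + " batches"); // prints "3 batches"
    }
}
```

With a stateless session there is no persistence context holding on to the entities, which is presumably why dropping the batch reference is enough in this setup.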

It's worth noting that Spring's transaction boundary handling closes the connection to the database at the end without needing to be told explicitly.
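
The stream itself still has to be closed by the caller, as the comment in MyServiceImpl says. A minimal sketch of that pattern with try-with-resources follows. It uses a plain in-memory stream, and the onClose handler is only a stand-in for whatever the real stream releases when closed; StreamCloseSketch and consumeAndClose are illustrative names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

final class StreamCloseSketch {

    // Consumes the rows inside try-with-resources and returns a log of events,
    // which makes the order of processing and closing visible.
    static List<String> consumeAndClose(List<String> rows) {
        List<String> events = new ArrayList<>();
        // The onClose handler stands in for releasing the underlying cursor.
        try (Stream<String> stream = rows.stream().onClose(() -> events.add("closed"))) {
            stream.forEachOrdered(row -> events.add("processed " + row));
        }
        return events;
    }

    public static void main(String[] args) {
        // prints "[processed a, processed b, closed]"
        System.out.println(consumeAndClose(Arrays.asList("a", "b")));
    }
}
```

The close handler runs when the try block exits, including when processing throws part-way through.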

#2


5  

The Hibernate ORM User Guide states:

"Internally, the stream() behaves like a Query#scroll and the underlying result is backed by a ScrollableResults."

You can check the source code of org.hibernate.query.internal.AbstractProducedQuery to confirm that clearing the session periodically, or evicting objects from the persistence context, is your responsibility.

As I understand from the comments, StatelessSession is not an option for you. I think the clean way to solve your case is to implement your own stream() method. It could be very similar to the original method; just replace ScrollableResultsIterator with your own iterator that does what you need (evicting objects or clearing the session) during iteration.
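
One way such an iterator might be shaped is sketched below. It is not Hibernate's implementation: PeriodicActionIterator is a made-up, library-free wrapper that runs a callback after every N elements. In a real setup the delegate would iterate over the ScrollableResults and the callback might be something like session::clear. Both are assumptions here, and entities handed out before a clear would become detached.

```java
import java.util.Iterator;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

final class PeriodicActionIterator<T> implements Iterator<T> {
    private final Iterator<T> delegate;
    private final int interval;
    private final Runnable action;
    private long returned = 0;

    PeriodicActionIterator(Iterator<T> delegate, int interval, Runnable action) {
        if (interval <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        this.delegate = delegate;
        this.interval = interval;
        this.action = action;
    }

    @Override
    public boolean hasNext() {
        return delegate.hasNext();
    }

    @Override
    public T next() {
        // Run the action just before fetching element interval+1, 2*interval+1, ...
        // In a sequential stream the earlier elements have already passed through
        // the downstream operations by the time the next one is requested.
        if (returned > 0 && returned % interval == 0) {
            action.run();
        }
        T value = delegate.next();
        returned++;
        return value;
    }

    // Wraps the source in a sequential, ordered stream that applies the action.
    static <T> Stream<T> stream(Iterator<T> source, int interval, Runnable action) {
        Iterator<T> it = new PeriodicActionIterator<>(source, interval, action);
        return StreamSupport.stream(
                Spliterators.spliteratorUnknownSize(it, Spliterator.ORDERED), false);
    }
}
```

The stream is deliberately sequential: with a parallel stream, elements could still be in flight when the callback fires.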
