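About fences. A good reference article is http://blog.csdn.net/jinzhuojun/article/details/39698317. This post walks through the code to explain my own understanding of how fences are created and passed along.
Why fences are needed
First, fences exist largely because of the GPU. Below is the introduction to the GPU from Wikipedia.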
A graphics processing unit (GPU), also occasionally called visual processing unit (VPU), is a specialized electronic circuit designed to rapidly manipulate and alter memory to accelerate the creation of images in a frame buffer intended for output to a display. GPUs are used in embedded systems, mobile phones, personal computers, workstations, and game consoles. Modern GPUs are very efficient at manipulating computer graphics and image processing, and their highly parallel structure makes them more effective than general-purpose CPUs for algorithms where the processing of large blocks of visual data is done in parallel.
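The GPU was created to accelerate getting graphics onto the display.
Today it is widely used in embedded devices, phones, PCs, servers, game consoles, and so on.
GPUs are very efficient at graphics processing, and their parallel architecture makes them far faster than CPUs on large parallel workloads. I worked with CUDA as a student and it left a strong impression: parallel data processing sped up by at least 10x.
The CPU and the GPU, however, run asynchronously. With OpenGL, the CPU first issues a series of gl commands, and those commands are then executed on the GPU, which does the actual drawing. The CPU has no idea when the drawing finishes. The CPU could of course block until the GPU is done and only then continue, but that would be far too inefficient.
The example below illustrates this well: a fence coordinates the GPU and the CPU, letting them run in parallel and speeding up image display.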
For example, an application may queue up work to be carried out in the GPU. The GPU then starts drawing that image. Although the image hasn’t been drawn into memory yet, the buffer pointer can still be passed to the window compositor along with a fence that indicates when the GPU work will be finished. The window compositor may then start processing ahead of time and hand off the work to the display controller. In this manner, the CPU work can be done ahead of time. Once the GPU finishes, the display controller can immediately display the image.
How fences are used
A fence is typically used as follows:
// First, create an EGLSyncKHR sync object
EGLSyncKHR sync = eglCreateSyncKHR(dpy,
EGL_SYNC_NATIVE_FENCE_ANDROID, NULL);
if (sync == EGL_NO_SYNC_KHR) {
ST_LOGE("syncForReleaseLocked: error creating EGL fence: %#x",
eglGetError());
return UNKNOWN_ERROR;
}
// Flush every command in the OpenGL command buffer for execution, instead of waiting for the buffer to fill up
glFlush();
// Convert the sync object into a fence fd
int fenceFd = eglDupNativeFenceFDANDROID(dpy, sync);
eglDestroySyncKHR(dpy, sync);
if (fenceFd == EGL_NO_NATIVE_FENCE_FD_ANDROID) {
ST_LOGE("syncForReleaseLocked: error dup'ing native fence "
"fd: %#x", eglGetError());
return UNKNOWN_ERROR;
}
// Create a Fence object from the fence fd
sp<Fence> fence(new Fence(fenceFd));
// Merge the newly created fence with the old fence
status_t err = addReleaseFenceLocked(mCurrentTexture,
mCurrentTextureBuf, fence);
Here, addReleaseFenceLocked is:
status_t ConsumerBase::addReleaseFenceLocked(int slot,
const sp<GraphicBuffer> graphicBuffer, const sp<Fence>& fence) {
CB_LOGV("addReleaseFenceLocked: slot=%d", slot);
// If consumer no longer tracks this graphicBuffer, we can safely
// drop this fence, as it will never be received by the producer.
if (!stillTracking(slot, graphicBuffer)) {
return OK;
}
// The old fence is null, so assign directly
if (!mSlots[slot].mFence.get()) {
mSlots[slot].mFence = fence;
} else {
// Otherwise, merge
sp<Fence> mergedFence = Fence::merge(
String8::format("%.28s:%d", mName.string(), slot),
mSlots[slot].mFence, fence);
if (!mergedFence.get()) {
CB_LOGE("failed to merge release fences");
// synchronization is broken, the best we can do is hope fences
// signal in order so the new fence will act like a union
mSlots[slot].mFence = fence;
return BAD_VALUE;
}
mSlots[slot].mFence = mergedFence;
}
return OK;
}
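To make the merge branch concrete, here is a minimal sketch of the idea that a merged fence signals only after both of its inputs have signaled. It is a toy model, not the Android implementation: TimedFence, mergeFences and addReleaseFence are hypothetical names, and a fence is reduced to the time at which it will signal.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for android::Fence: a fence is modelled only by the
// time at which it will signal. valid == false plays the role of "no fence".
struct TimedFence {
    bool valid;
    int64_t signalTimeNs;
};

// Models what a merge provides: the result signals only once both inputs
// have signalled, i.e. at the later of the two signal times.
inline TimedFence mergeFences(const TimedFence& a, const TimedFence& b) {
    return TimedFence{true, std::max(a.signalTimeNs, b.signalTimeNs)};
}

// Mirrors the branch structure of addReleaseFenceLocked for a single slot.
inline void addReleaseFence(TimedFence& slotFence, const TimedFence& incoming) {
    if (!slotFence.valid) {
        slotFence = incoming;                          // no old fence: assign directly
    } else {
        slotFence = mergeFences(slotFence, incoming);  // otherwise merge
    }
}
```

With this model, adding fences that signal at 100 and then 50 leaves the slot waiting until 100, so a producer that waits on the slot's fence is covered against every pending reader.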
About the Fence object: a fence is valid only when mFenceFd is not -1. Only then can it act as a "barrier" that synchronizes the CPU and the GPU.
// NO_FENCE has mFenceFd == -1
const sp<Fence> Fence::NO_FENCE = sp<Fence>(new Fence);
Fence::Fence() :
mFenceFd(-1) {
}
Fence::Fence(int fenceFd) :
mFenceFd(fenceFd) {
}
Because the Fence class implements the Flattenable protocol, it can be passed over binder.
Most recent Android devices support the “sync framework”. This allows the system to do some nifty thing when combined with hardware components that can manipulate graphics data asynchronously. For example, a producer can submit a series of OpenGL ES drawing commands and then enqueue the output buffer before rendering completes. The buffer is accompanied by a fence that signals when the contents are ready. A second fence accompanies the buffer when it is returned to the free list, so that the consumer can release the buffer while the contents are still in use. This approach improves latency and throughput as the buffers move through the system.
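The passage above is easier to understand in terms of BufferQueue's producer/consumer model; it describes how fences improve graphics display performance. The producer draws with OpenGL and queues the buffer right away, without waiting for the drawing to finish. Along with the buffer it hands BufferQueue a fence, and the consumer receives that fence when it acquires the buffer. The fence signals once the GPU finishes drawing. This is the so-called "acquireFence", which the producer uses to tell the consumer that production is complete.
When the consumer has finished what it needs to do with the acquired buffer (for example, handing it to surfaceflinger for composition), it releases the buffer back to BufferQueue's free list. Because surfaceflinger may still be using the buffer's contents, the release also carries a fence that indicates whether the contents are still in use. When the producer later dequeues this buffer, it must wait for that fence to signal before using it. This is the so-called "releaseFence", which the consumer uses to tell the producer that consumption is complete.
In general, a fence object (new Fence) is passed over binder between the producer and consumer of a single BufferQueue and does not travel between different BufferQueues. There is one exception: for a layer composed via overlay, its acquire fence is passed to HWComposer. An overlay is composed directly by the HAL-level hwcomposer, and the graphic buffer it uses is the buffer rendered by the upper-layer surface. If the upper-layer surface draws with OpenGL, hwcomposer must make sure rendering is complete before composing the overlay, that is, it waits in hwcomposer for this fence to signal, so the fence first has to be delivered to the HAL. This delivery does not go through BufferQueue's binder path; it is done by specific functions, analyzed later.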
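The two fences can be sketched with a toy model. This is only an illustration under my own assumptions: ToyFence, ToySlot, queueBuffer and releaseBuffer are hypothetical names, a shared boolean flag stands in for the kernel sync object, and nothing here blocks or talks to a GPU.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for android::Fence. A null flag plays the role of
// Fence::NO_FENCE (mFenceFd == -1); otherwise the flag flips to true on signal.
struct ToyFence {
    std::shared_ptr<bool> signalled;
    bool isValid() const { return signalled != nullptr; }
    // An invalid fence never blocks anyone, so it counts as already signalled.
    bool isSignalled() const { return !isValid() || *signalled; }
    void signal() const { if (isValid()) *signalled = true; }
    static ToyFence create() { return ToyFence{std::make_shared<bool>(false)}; }
};

// One buffer slot, reduced to the two fences that travel with it.
struct ToySlot {
    ToyFence acquireFence;  // producer -> consumer: drawing has finished
    ToyFence releaseFence;  // consumer -> producer: contents no longer in use
};

// Producer side: queue right after issuing draw commands, without waiting.
inline void queueBuffer(ToySlot& slot, const ToyFence& drawDone) {
    slot.acquireFence = drawDone;
}

// Consumer side: hand the buffer back while its contents may still be read.
inline void releaseBuffer(ToySlot& slot, const ToyFence& readDone) {
    slot.releaseFence = readDone;
}
```

A fresh slot has no release fence, so the first dequeue has nothing to wait for. After queueBuffer, the consumer sees an unsignaled acquire fence until the (simulated) GPU calls signal(), and the release direction works the same way in reverse.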
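Because OpenGL has both software and hardware implementations, the two are analyzed separately below, with code.
Software OpenGL implementation
The software implementation of OpenGL is agl. Although it was dropped in 4.4, one project of mine had neither a GPU nor overlay, so agl was the only option for composing layers. agl's eglCreateSyncKHR function is shown below, and its comment is clear: agl is synchronous because no GPU is involved, so every fence created through agl has mFenceFd == -1.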
EGLSyncKHR eglCreateSyncKHR(EGLDisplay dpy, EGLenum type,
const EGLint *attrib_list)
{
if (egl_display_t::is_valid(dpy) == EGL_FALSE) {
return setError(EGL_BAD_DISPLAY, EGL_NO_SYNC_KHR);
}
if (type != EGL_SYNC_FENCE_KHR ||
(attrib_list != NULL && attrib_list[0] != EGL_NONE)) {
return setError(EGL_BAD_ATTRIBUTE, EGL_NO_SYNC_KHR);
}
if (eglGetCurrentContext() == EGL_NO_CONTEXT) {
return setError(EGL_BAD_MATCH, EGL_NO_SYNC_KHR);
}
// AGL is synchronous; nothing to do here.
return FENCE_SYNC_HANDLE;
}
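Hardware OpenGL implementation
When a GPU is present, OpenGL is implemented in hardware and fence synchronization is required. Two cases are considered for the upper layer, with the lower layer composing via OpenGL and overlay:
a, the upper layer draws with canvas;
b, the upper layer draws with OpenGL.
http://blog.csdn.net/lewif/article/details/50946236 already covered how OpenGL functions are used together with EGL; the two upper-layer cases are analyzed below.
Upper layer drawing with canvas
When the upper layer draws with canvas, it first calls lockCanvas to obtain a Canvas.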
/*--------------surface.java---------------------------*/
public Canvas lockCanvas(Rect inOutDirty)
throws Surface.OutOfResourcesException, IllegalArgumentException {
// mNativeObject here is the pointer to the native-layer Surface object
mLockedObject = nativeLockCanvas(mNativeObject, mCanvas, inOutDirty);
return mCanvas;
}
/*--------------android_view_Surface.cpp---------------------------*/
static jint nativeLockCanvas(JNIEnv* env, jclass clazz,
jint nativeObject, jobject canvasObj, jobject dirtyRectObj) {
// Recover the Surface object from the nativeObject pointer
sp<Surface> surface(reinterpret_cast<Surface *>(nativeObject));
ANativeWindow_Buffer outBuffer;
status_t err = surface->lock(&outBuffer, dirtyRectPtr);
}
Next comes surface->lock. The lock function dequeues a buffer from BufferQueue and, once it has one, first waits for the release fence to signal.
/*--------------Surface.cpp---------------------------*/
status_t Surface::lock(
ANativeWindow_Buffer* outBuffer, ARect* inOutDirtyBounds)
{
ANativeWindowBuffer* out;
int fenceFd = -1;
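// Get a buffer from BufferQueue with dequeueBuffer, which returns a fence fd
// ① If this buffer slot is being dequeued for the first time, it carries NO_FENCE and the fence fd is -1
// ② If the slot has been used and released before, it comes with a
// release fence indicating whether the buffer contents are still in use. dequeueBuffer then returns
// that release fence, and after fence->dup() fenceFd is no longer -1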
status_t err = dequeueBuffer(&out, &fenceFd);
if (err == NO_ERROR) {
sp<GraphicBuffer> backBuffer(GraphicBuffer::getSelf(out));
// Create a Fence from the release fence fd above
sp<Fence> fence(new Fence(fenceFd));
// After dequeue, wait until the buffer contents are no longer in use before drawing
// On the first dequeue fenceFd is -1, so this returns immediately without blocking
err = fence->waitForever("Surface::lock");
if (err != OK) {
ALOGE("Fence::wait failed (%s)", strerror(-err));
cancelBuffer(out, fenceFd);
return err;
}
return err;
}
int Surface::dequeueBuffer(android_native_buffer_t** buffer, int* fenceFd) {
sp<Fence> fence;
status_t result = mGraphicBufferProducer->dequeueBuffer(&buf, &fence, mSwapIntervalZero,
reqW, reqH, mReqFormat, mReqUsage);
sp<GraphicBuffer>& gbuf(mSlots[buf].buffer);
if ((result & IGraphicBufferProducer::BUFFER_NEEDS_REALLOCATION) || gbuf == 0) {
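result = mGraphicBufferProducer->requestBuffer(buf, &gbuf);
// Bail out only if requestBuffer failed; otherwise continue with the fence handling below
if (result != NO_ERROR) {
return result;
}
}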
// Is the release fence returned by dequeue valid (fd not -1)?
if (fence->isValid()) {
*fenceFd = fence->dup();
if (*fenceFd == -1) {
ALOGE("dequeueBuffer: error duping fence: %d", errno);
// dup() should never fail; something is badly wrong. Soldier on
// and hope for the best; the worst that should happen is some
// visible corruption that lasts until the next frame.
}
} else {
*fenceFd = -1;
}
*buffer = gbuf.get();
return OK;
}
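The fence->dup() call above hands the caller its own file descriptor, independent of the one the Fence object owns. Here is a minimal sketch of that ownership rule using plain POSIX calls. OwnedFd and fdIsOpen are hypothetical helpers written for this post, and a pipe end stands in for a real fence fd.

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical RAII wrapper, loosely modelled on how a Fence owns mFenceFd:
// the wrapper closes its descriptor when it goes away.
class OwnedFd {
public:
    explicit OwnedFd(int fd = -1) : mFd(fd) {}
    ~OwnedFd() { if (mFd != -1) ::close(mFd); }
    OwnedFd(const OwnedFd&) = delete;
    OwnedFd& operator=(const OwnedFd&) = delete;

    bool isValid() const { return mFd != -1; }
    // Returns a new descriptor that the caller must close,
    // or -1 when there is nothing to duplicate.
    int dup() const { return isValid() ? ::dup(mFd) : -1; }

private:
    int mFd;
};

// True if fd currently refers to an open file description.
inline bool fdIsOpen(int fd) { return ::fcntl(fd, F_GETFD) != -1; }
```

The duplicate stays open after the wrapper has closed its own descriptor, which is why code that receives a dup'ed fence fd has to either wrap it in a new Fence or close it explicitly, as cancelBuffer and new Fence(fenceFd) do in the listings here.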
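dequeueBuffer above has returned a buffer, waitForever has returned, and the release fence has signaled, so the buffer can now be drawn into. Canvas drawing is fully synchronous (without hardware acceleration; implemented underneath by skcanvas). When drawing is done, the Java-layer Surface's unlockCanvasAndPost function is called, which eventually calls the native unlockAndPost. That function calls queueBuffer with a fence fd of -1. Why? Because canvas drawing is synchronous: by this point the upper layer has already finished drawing, so no acquire fence is needed.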
status_t Surface::unlockAndPost()
{
if (mLockedBuffer == 0) {
ALOGE("Surface::unlockAndPost failed, no locked buffer");
return INVALID_OPERATION;
}
status_t err = mLockedBuffer->unlock();
ALOGE_IF(err, "failed unlocking buffer (%p)", mLockedBuffer->handle);
// Call queueBuffer with fence fd -1, because canvas draws synchronously
err = queueBuffer(mLockedBuffer.get(), -1);
ALOGE_IF(err, "queueBuffer (handle=%p) failed (%s)",
mLockedBuffer->handle, strerror(-err));
mPostedBuffer = mLockedBuffer;
mLockedBuffer = 0;
return err;
}
int Surface::queueBuffer(android_native_buffer_t* buffer, int fenceFd) {
ATRACE_CALL();
ALOGV("Surface::queueBuffer");
Mutex::Autolock lock(mMutex);
int64_t timestamp;
bool isAutoTimestamp = false;
if (mTimestamp == NATIVE_WINDOW_TIMESTAMP_AUTO) {
timestamp = systemTime(SYSTEM_TIME_MONOTONIC);
isAutoTimestamp = true;
ALOGV("Surface::queueBuffer making up timestamp: %.2f ms",
timestamp / 1000000.f);
} else {
timestamp = mTimestamp;
}
int i = getSlotFromBufferLocked(buffer);
if (i < 0) {
return i;
}
// Make sure the crop rectangle is entirely inside the buffer.
Rect crop;
mCrop.intersect(Rect(buffer->width, buffer->height), &crop);
// Queue the buffer into BufferQueue, carrying a fence whose fd is -1 (NO_FENCE)
sp<Fence> fence(fenceFd >= 0 ?
new Fence(fenceFd) : Fence::NO_FENCE);
IGraphicBufferProducer::QueueBufferOutput output;
IGraphicBufferProducer::QueueBufferInput input(timestamp, isAutoTimestamp,
crop, mScalingMode, mTransform, mSwapIntervalZero, fence);
status_t err = mGraphicBufferProducer->queueBuffer(i, input, &output);
return err;
}
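Surface::queueBuffer has now queued the finished buffer into BufferQueue.
Upper layer drawing with OpenGL
As described earlier, drawing with OpenGL first sets up the native environment with EGL, then issues gl drawing commands, and finally calls eglSwapBuffers() to queue and dequeue buffers. The actual EGL and OpenGL implementations are tied to the GPU hardware, though, and vendors generally ship only .so libraries without source. The analysis below therefore uses the agl implementation, purely to illustrate how fences are created, passed, and waited on.
The eglCreateWindowSurface() function: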
/*--------------libagl\egl.cpp---------------------------*/
EGLSurface eglCreateWindowSurface( EGLDisplay dpy, EGLConfig config,
NativeWindowType window,
const EGLint *attrib_list)
{
return createWindowSurface(dpy, config, window, attrib_list);
}
static EGLSurface createWindowSurface(EGLDisplay dpy, EGLConfig config,
NativeWindowType window, const EGLint *attrib_list)
{
// The concrete implementation of egl_surface_t is egl_window_surface_v2_t
egl_surface_t* surface;
surface = new egl_window_surface_v2_t(dpy, config, depthFormat,
static_cast<ANativeWindow*>(window));
return surface;
}
swapBuffers() triggers dequeue and queue buffer. Since the vendor implementation's source is not available, its behavior can only be inferred from captured logs.
EGLBoolean egl_window_surface_v2_t::swapBuffers()
{
unlock(buffer);
previousBuffer = buffer;
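// eglMakeCurrent first calls connect, which dequeues the first buffer
// Call the surface's queueBuffer; the agl implementation passes a fence fd of -1
// In logs captured on a Qualcomm platform, the fence fd after queue buffer is never -1.
// Presumably a fence sync object was created earlier, so after the merge it is no longer -1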
nativeWindow->queueBuffer(nativeWindow, buffer, -1);
buffer = 0;
// dequeue a new buffer
int fenceFd = -1;
// A buffer slot allocated for the first time returns -1
// Otherwise there is a release fence, and its fence fd is dup'ed
if (nativeWindow->dequeueBuffer(nativeWindow, &buffer, &fenceFd) == NO_ERROR) {
sp<Fence> fence(new Fence(fenceFd));
// Wait for the release fence to signal
if (fence->wait(Fence::TIMEOUT_NEVER)) {
nativeWindow->cancelBuffer(nativeWindow, buffer, fenceFd);
return setError(EGL_BAD_ALLOC, EGL_FALSE);
}
// reallocate the depth-buffer if needed
if ((width != buffer->width) || (height != buffer->height)) {
// TODO: we probably should reset the swap rect here
// if the window size has changed
width = buffer->width;
height = buffer->height;
if (depth.data) {
free(depth.data);
depth.width = width;
depth.height = height;
depth.stride = buffer->stride;
depth.data = (GGLubyte*)malloc(depth.stride*depth.height*2);
if (depth.data == 0) {
setError(EGL_BAD_ALLOC, EGL_FALSE);
return EGL_FALSE;
}
}
}
// keep a reference on the buffer
buffer->common.incRef(&buffer->common);
// finally pin the buffer down
if (lock(buffer, GRALLOC_USAGE_SW_READ_OFTEN |
GRALLOC_USAGE_SW_WRITE_OFTEN, &bits) != NO_ERROR) {
ALOGE("eglSwapBuffers() failed to lock buffer %p (%ux%u)",
buffer, buffer->width, buffer->height);
return setError(EGL_BAD_ACCESS, EGL_FALSE);
// FIXME: we should make sure we're not accessing the buffer anymore
}
} else {
return setError(EGL_BAD_CURRENT_SURFACE, EGL_FALSE);
}
return EGL_TRUE;
}
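Lower-layer composition
After the two drawing paths above, once the buffer has been queued:
a, for canvas the acquire fence fd is -1, because drawing is synchronous;
b, for OpenGL the acquire fence fd is not -1, because OpenGL drawing is asynchronous;
The SurfaceFlingerConsumer in the layer acts as the consumer (the counterpart of the surface). Queuing a buffer into BufferQueue eventually triggers the layer's onFrameAvailable() function, which in turn triggers a surfaceflinger vsync event.
updateTexImage
When surfaceflinger handles the vsync signal, it runs handleMessageInvalidate() -> handlePageFlip() -> layer->latchBuffer(). latchBuffer first calls mSurfaceFlingerConsumer->updateTexImage().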
Region Layer::latchBuffer(bool& recomputeVisibleRegions)
{
ATRACE_CALL();
Region outDirtyRegion;
if (mQueuedFrames > 0) {
sp<GraphicBuffer> oldActiveBuffer = mActiveBuffer;
Reject r(mDrawingState, getCurrentState(), recomputeVisibleRegions);
// ① Call updateTexImage
status_t updateResult = mSurfaceFlingerConsumer->updateTexImage(&r);
// update the active buffer
mActiveBuffer = mSurfaceFlingerConsumer->getCurrentBuffer();
if (mActiveBuffer == NULL) {
// this can only happen if the very first buffer was rejected.
return outDirtyRegion;
}
return outDirtyRegion;
}
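This is where the buffer is acquired; the acquired buffer is then used for composition. The main steps are:
a, acquire a new buffer;
b, release the previous buffer first, creating a release fence for it and passing that release fence to the mFence of the matching mSlots[slot] in BufferQueue.
c, update mCurrentTexture and mCurrentTextureBuf to the newly acquired slot and buffer.
The acquire fence fd is not used yet at this point, because the layer has not been composed and its data has not been read.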
status_t SurfaceFlingerConsumer::updateTexImage(BufferRejecter* rejecter)
{
BufferQueue::BufferItem item;
// Acquire the next buffer.
// In asynchronous mode the list is guaranteed to be one buffer
// deep, while in synchronous mode we use the oldest buffer.
// ① First, acquireBuffer
// a, if the upper layer draws with canvas, the fence fd obtained is -1
// b, if the upper layer draws with OpenGL, the fence fd obtained is not -1
err = acquireBufferLocked(&item, computeExpectedPresent());
// ② Only one graphic buffer is handled at a time, so release the previous buffer first so others can use it
// First create a release fence,
// then pass it to the mFence of the matching mSlots[slot] in BufferQueue
// Release the previous buffer.
err = updateAndReleaseLocked(item);
// ③ 4.4 no longer takes this branch; the texture is bound in Layer::onDraw instead
if (!SyncFeatures::getInstance().useNativeFenceSync()) {
// Bind the new buffer to the GL texture.
//
// Older devices require the "implicit" synchronization provided
// by glEGLImageTargetTexture2DOES, which this method calls. Newer
// devices will either call this in Layer::onDraw, or (if it's not
// a GL-composited layer) not at all.
err = bindTextureImageLocked();
}
return err;
}
status_t GLConsumer::updateAndReleaseLocked(const BufferQueue::BufferItem& item)
{
int buf = item.mBuf;
// If the EGLImageKHR for mEglSlots[buf] has not been created yet, create it first
// If the mEglSlot entry is empty, create an EGLImage for the gralloc
// buffer currently in the slot in ConsumerBase.
//
// We may have to do this even when item.mGraphicBuffer == NULL (which
// means the buffer was previously acquired), if we destroyed the
// EGLImage when detaching from a context but the buffer has not been
// re-allocated.
if (mEglSlots[buf].mEglImage == EGL_NO_IMAGE_KHR) {
EGLImageKHR image = createImage(mEglDisplay,
mSlots[buf].mGraphicBuffer, item.mCrop);
mEglSlots[buf].mEglImage = image;
mEglSlots[buf].mCropRect = item.mCrop;
}
// Before releasing the old buffer, add a release fence to it, since it may still be in use
// Do whatever sync ops we need to do before releasing the old slot.
err = syncForReleaseLocked(mEglDisplay);
// Release the old buffer first
// The first time through, mCurrentTexture is BufferQueue::INVALID_BUFFER_SLOT (-1)
// Pass the release fence to the mFence of the matching mSlots[slot] in BufferQueue
// release old buffer
if (mCurrentTexture != BufferQueue::INVALID_BUFFER_SLOT) {
status_t status = releaseBufferLocked(
mCurrentTexture, mCurrentTextureBuf, mEglDisplay,
mEglSlots[mCurrentTexture].mEglFence);
if (status < NO_ERROR) {
ST_LOGE("updateAndRelease: failed to release buffer: %s (%d)",
strerror(-status), status);
err = status;
// keep going, with error raised [?]
}
}
// Update mCurrentTexture and mCurrentTextureBuf to the newly acquired buffer
// Update the GLConsumer state.
mCurrentTexture = buf;
mCurrentTextureBuf = mSlots[buf].mGraphicBuffer;
mCurrentCrop = item.mCrop;
mCurrentTransform = item.mTransform;
mCurrentScalingMode = item.mScalingMode;
mCurrentTimestamp = item.mTimestamp;
// This is the acquire fence passed over by the producer
mCurrentFence = item.mFence;
mCurrentFrameNumber = item.mFrameNumber;
computeCurrentTransformMatrixLocked();
return err;
}
status_t GLConsumer::syncForReleaseLocked(EGLDisplay dpy) {
ST_LOGV("syncForReleaseLocked");
// mCurrentTexture is -1 only the first time
// Create a release fence
if (mCurrentTexture != BufferQueue::INVALID_BUFFER_SLOT) {
if (SyncFeatures::getInstance().useNativeFenceSync()) {
EGLSyncKHR sync = eglCreateSyncKHR(dpy,
EGL_SYNC_NATIVE_FENCE_ANDROID, NULL);
if (sync == EGL_NO_SYNC_KHR) {
ST_LOGE("syncForReleaseLocked: error creating EGL fence: %#x",
eglGetError());
return UNKNOWN_ERROR;
}
glFlush();
int fenceFd = eglDupNativeFenceFDANDROID(dpy, sync);
eglDestroySyncKHR(dpy, sync);
if (fenceFd == EGL_NO_NATIVE_FENCE_FD_ANDROID) {
ST_LOGE("syncForReleaseLocked: error dup'ing native fence "
"fd: %#x", eglGetError());
return UNKNOWN_ERROR;
}
sp<Fence> fence(new Fence(fenceFd));
status_t err = addReleaseFenceLocked(mCurrentTexture,
mCurrentTextureBuf, fence);
if (err != OK) {
ST_LOGE("syncForReleaseLocked: error adding release fence: "
"%s (%d)", strerror(-err), err);
return err;
}
} else if (mUseFenceSync && SyncFeatures::getInstance().useFenceSync()) {
// Not reached
}
return OK;
}
status_t ConsumerBase::addReleaseFenceLocked(int slot,
const sp<GraphicBuffer> graphicBuffer, const sp<Fence>& fence) {
// Add the fence to mSlots[slot].mFence
if (!mSlots[slot].mFence.get()) {
mSlots[slot].mFence = fence;
} else {
sp<Fence> mergedFence = Fence::merge(
String8::format("%.28s:%d", mName.string(), slot),
mSlots[slot].mFence, fence);
if (!mergedFence.get()) {
CB_LOGE("failed to merge release fences");
// synchronization is broken, the best we can do is hope fences
// signal in order so the new fence will act like a union
mSlots[slot].mFence = fence;
return BAD_VALUE;
}
mSlots[slot].mFence = mergedFence;
}
return OK;
}
status_t GLConsumer::releaseBufferLocked(int buf,
sp<GraphicBuffer> graphicBuffer,
EGLDisplay display, EGLSyncKHR eglFence) {
// release the buffer if it hasn't already been discarded by the
// BufferQueue. This can happen, for example, when the producer of this
// buffer has reallocated the original buffer slot after this buffer
// was acquired.
status_t err = ConsumerBase::releaseBufferLocked(
buf, graphicBuffer, display, eglFence);
mEglSlots[buf].mEglFence = EGL_NO_SYNC_KHR;
return err;
}
status_t ConsumerBase::releaseBufferLocked(
int slot, const sp<GraphicBuffer> graphicBuffer,
EGLDisplay display, EGLSyncKHR eglFence) {
// If consumer no longer tracks this graphicBuffer (we received a new
// buffer on the same slot), the buffer producer is definitely no longer
// tracking it.
if (!stillTracking(slot, graphicBuffer)) {
return OK;
}
// When releasing the old buffer, mSlots[slot].mFence is passed in, i.e. the release fence
status_t err = mConsumer->releaseBuffer(slot, mSlots[slot].mFrameNumber,
display, eglFence, mSlots[slot].mFence);
if (err == BufferQueue::STALE_BUFFER_SLOT) {
freeBufferLocked(slot);
}
mSlots[slot].mFence = Fence::NO_FENCE;
return err;
}
doComposeSurfaces
Next, surfaceflinger composes the layers. It composes only HWC_FRAMEBUFFER layers; for layers composed via overlay it simply sets the layer's acquire fence into the hwc_layer_1_t structure in hwcomposer.
void SurfaceFlinger::doComposeSurfaces(const sp<const DisplayDevice>& hw, const Region& dirty)
{
/*
* and then, render the layers targeted at the framebuffer
*/
const Vector< sp<Layer> >& layers(hw->getVisibleLayersSortedByZ());
const size_t count = layers.size();
const Transform& tr = hw->getTransform();
if (cur != end) {
// we're using h/w composer
for (size_t i=0 ; i<count && cur!=end ; ++i, ++cur) {
const sp<Layer>& layer(layers[i]);
const Region clip(dirty.intersect(tr.transform(layer->visibleRegion)));
if (!clip.isEmpty()) {
switch (cur->getCompositionType()) {
// Overlay layers are not composed here
case HWC_OVERLAY: {
const Layer::State& state(layer->getDrawingState());
if ((cur->getHints() & HWC_HINT_CLEAR_FB)
&& i
&& layer->isOpaque() && (state.alpha == 0xFF)
&& hasGlesComposition) {
// never clear the very first layer since we're
// guaranteed the FB is already cleared
layer->clearWithOpenGL(hw, clip);
}
break;
}
// surfaceflinger composes HWC_FRAMEBUFFER layers with OpenGL
case HWC_FRAMEBUFFER: {
layer->draw(hw, clip);
break;
}
case HWC_FRAMEBUFFER_TARGET: {
// this should not happen as the iterator shouldn't
// let us get there.
ALOGW("HWC_FRAMEBUFFER_TARGET found in hwc list (index=%d)", i);
break;
}
}
}
layer->setAcquireFence(hw, *cur);
}
} else {
}
}
// Note that the layer parameter is a HWComposer::HWCLayerInterface;
// the layer's acquire fence is set directly into the hwc_layer_1_t structure in hwcomposer
void Layer::setAcquireFence(const sp<const DisplayDevice>& hw,
HWComposer::HWCLayerInterface& layer) {
int fenceFd = -1;
// TODO: there is a possible optimization here: we only need to set the
// acquire fence the first time a new buffer is acquired on EACH display.
// An overlay layer is composed by hwcomposer, so first get the acquire fence from the consumer
// (of the BufferQueue shared by the surface and the layer)
if (layer.getCompositionType() == HWC_OVERLAY) {
sp<Fence> fence = mSurfaceFlingerConsumer->getCurrentFence();
if (fence->isValid()) {
fenceFd = fence->dup();
if (fenceFd == -1) {
ALOGW("failed to dup layer fence, skipping sync: %d", errno);
}
}
}
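// For an overlay layer the fence fd set here is not -1 when the upper layer draws with OpenGL,
// since the producer may not have finished drawing yet. If the upper layer draws with canvas, it is -1 here too.
// All other layers get a fence with fd -1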
layer.setAcquireFenceFd(fenceFd);
}
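The decision made in setAcquireFence can be condensed into a few lines. This is only a sketch of the selection logic: ToyCompositionType and acquireFenceFdForHwc are hypothetical names, and unlike the real code the sketch returns the fd unchanged instead of dup'ing it.

```cpp
#include <cassert>

// Hypothetical composition types standing in for HWC_FRAMEBUFFER / HWC_OVERLAY.
enum ToyCompositionType { TOY_FRAMEBUFFER, TOY_OVERLAY };

// Picks the fence fd handed to the hardware composer for one layer.
// currentFenceFd is the consumer's current acquire fence fd (-1 for no fence).
// Assumption: the real code dup()s the fd; this sketch returns it as-is.
inline int acquireFenceFdForHwc(ToyCompositionType type, int currentFenceFd) {
    if (type == TOY_OVERLAY && currentFenceFd >= 0) {
        return currentFenceFd;  // overlay: hwcomposer must wait for the producer
    }
    return -1;                  // GL-composed layers wait in bindTextureImage instead
}
```

Only an overlay layer with a valid acquire fence forwards a real fd. Layers composed by surfaceflinger itself pass -1 because their acquire fence is waited on later, when the buffer is bound as a texture.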
// layer->draw -> layer->onDraw
void Layer::onDraw(const sp<const DisplayDevice>& hw, const Region& clip) const
{
ATRACE_CALL();
// First, bind the buffer to the GL texture
// Bind the current buffer to the GL texture, and wait for it to be
// ready for us to draw into.
status_t err = mSurfaceFlingerConsumer->bindTextureImage();
// Compose with OpenGL
drawWithOpenGL(hw, clip);
}
// SurfaceFlingerConsumer::bindTextureImage -> GLConsumer::bindTextureImageLocked
status_t GLConsumer::bindTextureImageLocked() {
if (mEglDisplay == EGL_NO_DISPLAY) {
ALOGE("bindTextureImage: invalid display");
return INVALID_OPERATION;
}
GLint error;
while ((error = glGetError()) != GL_NO_ERROR) {
ST_LOGW("bindTextureImage: clearing GL error: %#04x", error);
}
glBindTexture(mTexTarget, mTexName);
if (mCurrentTexture == BufferQueue::INVALID_BUFFER_SLOT) {
if (mCurrentTextureBuf == NULL) {
ST_LOGE("bindTextureImage: no currently-bound texture");
return NO_INIT;
}
status_t err = bindUnslottedBufferLocked(mEglDisplay);
if (err != NO_ERROR) {
return err;
}
} else {
// Get the EGLImageKHR for the current mCurrentTexture from mEglSlots
EGLImageKHR image = mEglSlots[mCurrentTexture].mEglImage;
glEGLImageTargetTexture2DOES(mTexTarget, (GLeglImageOES)image);
while ((error = glGetError()) != GL_NO_ERROR) {
ST_LOGE("bindTextureImage: error binding external texture image %p"
": %#04x", image, error);
return UNKNOWN_ERROR;
}
}
// Wait for this buffer's acquire fence to signal, i.e. wait for drawing to finish
// Wait for the new buffer to be ready.
return doGLFenceWaitLocked();
}
// Wait for the producer's acquire fence to signal
status_t GLConsumer::doGLFenceWaitLocked() const {
EGLDisplay dpy = eglGetCurrentDisplay();
EGLContext ctx = eglGetCurrentContext();
if (mEglDisplay != dpy || mEglDisplay == EGL_NO_DISPLAY) {
ST_LOGE("doGLFenceWait: invalid current EGLDisplay");
return INVALID_OPERATION;
}
if (mEglContext != ctx || mEglContext == EGL_NO_CONTEXT) {
ST_LOGE("doGLFenceWait: invalid current EGLContext");
return INVALID_OPERATION;
}
// Wait for the producer's acquire fence to signal
if (mCurrentFence->isValid()) {
if (SyncFeatures::getInstance().useWaitSync()) {
// Create an EGLSyncKHR from the current fence.
int fenceFd = mCurrentFence->dup();
if (fenceFd == -1) {
ST_LOGE("doGLFenceWait: error dup'ing fence fd: %d", errno);
return -errno;
}
EGLint attribs[] = {
EGL_SYNC_NATIVE_FENCE_FD_ANDROID, fenceFd,
EGL_NONE
};
EGLSyncKHR sync = eglCreateSyncKHR(dpy,
EGL_SYNC_NATIVE_FENCE_ANDROID, attribs);
if (sync == EGL_NO_SYNC_KHR) {
close(fenceFd);
ST_LOGE("doGLFenceWait: error creating EGL fence: %#x",
eglGetError());
return UNKNOWN_ERROR;
}
// XXX: The spec draft is inconsistent as to whether this should
// return an EGLint or void. Ignore the return value for now, as
// it's not strictly needed.
eglWaitSyncKHR(dpy, sync, 0);
EGLint eglErr = eglGetError();
eglDestroySyncKHR(dpy, sync);
if (eglErr != EGL_SUCCESS) {
ST_LOGE("doGLFenceWait: error waiting for EGL fence: %#x",
eglErr);
return UNKNOWN_ERROR;
}
} else {
status_t err = mCurrentFence->waitForever(
"GLConsumer::doGLFenceWaitLocked");
if (err != NO_ERROR) {
ST_LOGE("doGLFenceWait: error waiting for fence: %d", err);
return err;
}
}
}
return NO_ERROR;
}
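At this point surfaceflinger has finished composing the HWC_FRAMEBUFFER layers. Since this uses OpenGL, a fence is again needed to coordinate the GPU and the CPU.
DisplayDevice and FramebufferSurface
The result of composing layers with OpenGL is placed in the HWC_FRAMEBUFFER_TARGET layer and then handed to HWComposer. surfaceflinger's init function defines the producer and consumer used when composing the HWC_FRAMEBUFFER_TARGET layer: each display has a DisplayDevice as the producer (of OpenGL-composed data), and FramebufferSurface is the matching consumer (note that this consumer handles only OpenGL composition; overlay is handled entirely by the HAL-level hwcomposer).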
// BufferQueue
sp<BufferQueue> bq = new BufferQueue(new GraphicBufferAlloc());
// FramebufferSurface is the composition consumer, the consumer end of bq
sp<FramebufferSurface> fbs = new FramebufferSurface(*mHwc, i, bq);
// DisplayDevice is the producer of composed data, the producer end of bq
sp<DisplayDevice> hw = new DisplayDevice(this,
type, allocateHwcDisplayId(type), isSecure, token,
fbs, bq,
mEGLConfig);
DisplayDevice::DisplayDevice(
const sp<SurfaceFlinger>& flinger,
DisplayType type,
int32_t hwcId,
bool isSecure,
const wp<IBinder>& displayToken,
const sp<DisplaySurface>& displaySurface,
const sp<IGraphicBufferProducer>& producer,
EGLConfig config)
: mFlinger(flinger),
mType(type), mHwcDisplayId(hwcId),
mDisplayToken(displayToken),
mDisplaySurface(displaySurface),
mDisplay(EGL_NO_DISPLAY),
mSurface(EGL_NO_SURFACE),
mDisplayWidth(), mDisplayHeight(), mFormat(),
mFlags(),
mPageFlipCount(),
mIsSecure(isSecure),
mSecureLayerVisible(false),
mScreenAcquired(false),
mLayerStack(NO_LAYER_STACK),
mOrientation()
{
// Create a Surface from bq
mNativeWindow = new Surface(producer, false);
ANativeWindow* const window = mNativeWindow.get();
int format;
window->query(window, NATIVE_WINDOW_FORMAT, &format);
// Make sure that composition can never be stalled by a virtual display
// consumer that isn't processing buffers fast enough. We have to do this
// in two places:
// * Here, in case the display is composed entirely by HWC.
// * In makeCurrent(), using eglSwapInterval. Some EGL drivers set the
// window's swap interval in eglMakeCurrent, so they'll override the
// interval we set here.
if (mType >= DisplayDevice::DISPLAY_VIRTUAL)
window->setSwapInterval(window, 0);
/*
* Create our display's surface
*/
// Set up the native OpenGL environment with EGL, since layers are composed with OpenGL
EGLSurface surface;
EGLint w, h;
EGLDisplay display = eglGetDisplay(EGL_DEFAULT_DISPLAY);
surface = eglCreateWindowSurface(display, config, window, NULL);
eglQuerySurface(display, surface, EGL_WIDTH, &mDisplayWidth);
eglQuerySurface(display, surface, EGL_HEIGHT, &mDisplayHeight);
mDisplay = display;
mSurface = surface;
mFormat = format;
mPageFlipCount = 0;
mViewport.makeInvalid();
mFrame.makeInvalid();
// virtual displays are always considered enabled
mScreenAcquired = (mType >= DisplayDevice::DISPLAY_VIRTUAL);
// Name the display. The name will be replaced shortly if the display
// was created with createDisplay().
switch (mType) {
case DISPLAY_PRIMARY:
mDisplayName = "Built-in Screen";
break;
case DISPLAY_EXTERNAL:
mDisplayName = "HDMI Screen";
break;
default:
mDisplayName = "Virtual Screen"; // e.g. Overlay #n
break;
}
}
After OpenGL finishes composing the layers, the call chain is SurfaceFlinger::doDisplayComposition -> hw->swapBuffers() -> DisplayDevice::swapBuffers -> eglSwapBuffers():
void DisplayDevice::swapBuffers(HWComposer& hwc) const {
// We need to call eglSwapBuffers() if:
// (1) we don't have a hardware composer, or
// (2) we did GLES composition this frame, and either
// (a) we have framebuffer target support (not present on legacy
// devices, where HWComposer::commit() handles things); or
// (b) this is a virtual display
if (hwc.initCheck() != NO_ERROR ||
(hwc.hasGlesComposition(mHwcDisplayId) &&
(hwc.supportsFramebufferTarget() || mType >= DISPLAY_VIRTUAL))) {
// Call eglSwapBuffers to swap buffers
EGLBoolean success = eglSwapBuffers(mDisplay, mSurface);
if (!success) {
EGLint error = eglGetError();
if (error == EGL_CONTEXT_LOST ||
mType == DisplayDevice::DISPLAY_PRIMARY) {
LOG_ALWAYS_FATAL("eglSwapBuffers(%p, %p) failed with 0x%08x",
mDisplay, mSurface, error);
} else {
ALOGE("eglSwapBuffers(%p, %p) failed with 0x%08x",
mDisplay, mSurface, error);
}
}
}
}
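As mentioned in an earlier post, eglSwapBuffers makes the DisplayDevice producer dequeue and queue a buffer. OpenGL composition here works like OpenGL drawing in the upper layer: queueBuffer attaches an acquire fence to the buffer, which is passed to the consumer, FramebufferSurface. Queuing the buffer triggers FramebufferSurface::onFrameAvailable().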
void FramebufferSurface::onFrameAvailable() {
sp<GraphicBuffer> buf;
sp<Fence> acquireFence;
// acquire buffer: surfaceflinger composes the HWC_FRAMEBUFFER_TARGET layer with OpenGL,
// so the "composition" production may not be finished yet. Get the fence set at queue buffer time
status_t err = nextBuffer(buf, acquireFence);
if (err != NO_ERROR) {
ALOGE("error latching nnext FramebufferSurface buffer: %s (%d)",
strerror(-err), err);
return;
}
err = mHwc.fbPost(mDisplayType, acquireFence, buf);
if (err != NO_ERROR) {
ALOGE("error posting framebuffer: %d", err);
}
}
status_t FramebufferSurface::nextBuffer(sp<GraphicBuffer>& outBuffer, sp<Fence>& outFence) {
Mutex::Autolock lock(mMutex);
BufferQueue::BufferItem item;
// First, acquire buffer
status_t err = acquireBufferLocked(&item, 0);
// Release the old buffer back to BufferQueue first; the release must carry a release fence
// If the BufferQueue has freed and reallocated a buffer in mCurrentSlot
// then we may have acquired the slot we already own. If we had released
// our current buffer before we call acquireBuffer then that release call
// would have returned STALE_BUFFER_SLOT, and we would have called
// freeBufferLocked on that slot. Because the buffer slot has already
// been overwritten with the new buffer all we have to do is skip the
// releaseBuffer call and we should be in the same state we'd be in if we
// had released the old buffer first.
if (mCurrentBufferSlot != BufferQueue::INVALID_BUFFER_SLOT &&
item.mBuf != mCurrentBufferSlot) {
// Release the previous buffer.
err = releaseBufferLocked(mCurrentBufferSlot, mCurrentBuffer,
EGL_NO_DISPLAY, EGL_NO_SYNC_KHR);
if (err < NO_ERROR) {
ALOGE("error releasing buffer: %s (%d)", strerror(-err), err);
return err;
}
}
mCurrentBufferSlot = item.mBuf;
mCurrentBuffer = mSlots[mCurrentBufferSlot].mGraphicBuffer;
outFence = item.mFence;
outBuffer = mCurrentBuffer;
return NO_ERROR;
}
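The slot bookkeeping in nextBuffer is easy to miss among the comments, so here is a minimal sketch of just that rule: the previous slot is released only when one is held and the newly acquired slot is a different one. ToyFramebufferConsumer and kInvalidSlot are hypothetical names, and fences are left out entirely.

```cpp
#include <cassert>
#include <vector>

const int kInvalidSlot = -1;  // stands in for BufferQueue::INVALID_BUFFER_SLOT

// Hypothetical consumer that tracks only which slot it currently holds.
struct ToyFramebufferConsumer {
    int currentSlot;
    std::vector<int> released;  // slots handed back to the queue, in order

    ToyFramebufferConsumer() : currentSlot(kInvalidSlot) {}

    void onBufferAcquired(int acquiredSlot) {
        // Skip the release on the first frame and when the same slot comes
        // back (the queue already replaced the buffer behind that slot).
        if (currentSlot != kInvalidSlot && acquiredSlot != currentSlot) {
            released.push_back(currentSlot);
        }
        currentSlot = acquiredSlot;
    }
};
```

Acquiring slots 0, 1, 1 in that order releases slot 0 exactly once: nothing is held on the first frame, and the repeated slot 1 is skipped.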
status_t ConsumerBase::releaseBufferLocked(
int slot, const sp<GraphicBuffer> graphicBuffer,
EGLDisplay display, EGLSyncKHR eglFence) {
// If consumer no longer tracks this graphicBuffer (we received a new
// buffer on the same slot), the buffer producer is definitely no longer
// tracking it.
if (!stillTracking(slot, graphicBuffer)) {
return OK;
}
CB_LOGV("releaseBufferLocked: slot=%d/%llu",
slot, mSlots[slot].mFrameNumber);
// mSlots[slot].mFence, the release fence, is passed back to BufferQueue
// Where is this release fence set? Presumably by the hwcomposer lower layer, and then
// postFramebuffer() -> hw->onSwapBuffersCompleted() stores the release fence
// into the corresponding slot.
status_t err = mConsumer->releaseBuffer(slot, mSlots[slot].mFrameNumber,
display, eglFence, mSlots[slot].mFence);
if (err == BufferQueue::STALE_BUFFER_SLOT) {
freeBufferLocked(slot);
}
mSlots[slot].mFence = Fence::NO_FENCE;
return err;
}
void SurfaceFlinger::postFramebuffer()
{
ATRACE_CALL();
const nsecs_t now = systemTime();
mDebugInSwapBuffers = now;
HWComposer& hwc(getHwComposer());
if (hwc.initCheck() == NO_ERROR) {
hwc.commit();
}
// make the default display current because the VirtualDisplayDevice code cannot
// deal with dequeueBuffer() being called outside of the composition loop; however
// the code below can call glFlush() which is allowed (and does in some case) call
// dequeueBuffer().
getDefaultDisplayDevice()->makeCurrent(mEGLDisplay, mEGLContext);
for (size_t dpy=0 ; dpy<mDisplays.size() ; dpy++) {
sp<const DisplayDevice> hw(mDisplays[dpy]);
const Vector< sp<Layer> >& currentLayers(hw->getVisibleLayersSortedByZ());
// As the name suggests, the swapbuffers of the framebuffer target layer has completed.
hw->onSwapBuffersCompleted(hwc);
const size_t count = currentLayers.size();
int32_t id = hw->getHwcDisplayId();
if (id >=0 && hwc.initCheck() == NO_ERROR) {
HWComposer::LayerListIterator cur = hwc.begin(id);
const HWComposer::LayerListIterator end = hwc.end(id);
for (size_t i = 0; cur != end && i < count; ++i, ++cur) {
currentLayers[i]->onLayerDisplayed(hw, &*cur);
}
} else {
for (size_t i = 0; i < count; i++) {
currentLayers[i]->onLayerDisplayed(hw, NULL);
}
}
}
}
void DisplayDevice::onSwapBuffersCompleted(HWComposer& hwc) const {
if (hwc.initCheck() == NO_ERROR) {
mDisplaySurface->onFrameCommitted();
}
}
// onFrameCommitted mainly fetches the release fence set by hwcomposer and stores it in the slot
void FramebufferSurface::onFrameCommitted() {
sp<Fence> fence = mHwc.getAndResetReleaseFence(mDisplayType);
if (fence->isValid() &&
mCurrentBufferSlot != BufferQueue::INVALID_BUFFER_SLOT) {
status_t err = addReleaseFence(mCurrentBufferSlot,
mCurrentBuffer, fence);
ALOGE_IF(err, "setReleaseFenceFd: failed to add the fence: %s (%d)",
strerror(-err), err);
}
}
sp<Fence> HWComposer::getAndResetReleaseFence(int32_t id) {
if (uint32_t(id)>31 || !mAllocatedDisplayIDs.hasBit(id))
return Fence::NO_FENCE;
int fd = INVALID_OPERATION;
if (mHwc && hwcHasApiVersion(mHwc, HWC_DEVICE_API_VERSION_1_1)) {
const DisplayData& disp(mDisplayData[id]);
// disp.framebufferTarget->releaseFenceFd here is presumably set by the underlying hwcomposer HAL
if (disp.framebufferTarget) {
fd = disp.framebufferTarget->releaseFenceFd;
disp.framebufferTarget->acquireFenceFd = -1;
disp.framebufferTarget->releaseFenceFd = -1;
}
}
return fd >= 0 ?
new Fence(fd) : Fence::NO_FENCE;
}
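Both getAndResetReleaseFence variants in this article follow the same pattern: read the fd out of the HAL structure and leave -1 behind, so that each release fence is consumed exactly once. A minimal sketch of that pattern, assuming a hypothetical `ToyLayer` struct in place of the real HAL layer type:

```cpp
#include <utility>

// Hypothetical stand-in for the HAL layer struct; only the fence field is modelled.
struct ToyLayer {
    int releaseFenceFd = -1;
};

// Take the fd out of the layer and reset the field to -1 in one step.
// The caller becomes the sole owner of the returned fd (or gets -1 if none was set).
int takeReleaseFence(ToyLayer& layer) {
    return std::exchange(layer.releaseFenceFd, -1);
}
```

A second call on the same layer returns -1, which corresponds to Fence::NO_FENCE in the original code.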
After the buffer is acquired, HWComposer's fbPost function is called. It sets the framebufferTarget's buffer handle and the acquireFenceFd of the framebufferTarget layer.
int HWComposer::fbPost(int32_t id,
const sp<Fence>& acquireFence, const sp<GraphicBuffer>& buffer) {
if (mHwc && hwcHasApiVersion(mHwc, HWC_DEVICE_API_VERSION_1_1)) {
// Android 4.4 takes this branch
return setFramebufferTarget(id, acquireFence, buffer);
} else {
acquireFence->waitForever("HWComposer::fbPost");
return mFbDev->post(mFbDev, buffer->handle);
}
}
// Set framebufferTarget->handle and framebufferTarget->acquireFenceFd for the given display
status_t HWComposer::setFramebufferTarget(int32_t id,
const sp<Fence>& acquireFence, const sp<GraphicBuffer>& buf) {
if (uint32_t(id)>31 || !mAllocatedDisplayIDs.hasBit(id)) {
return BAD_INDEX;
}
DisplayData& disp(mDisplayData[id]);
if (!disp.framebufferTarget) {
// this should never happen, but apparently eglCreateWindowSurface()
// triggers a Surface::queueBuffer() on some
// devices (!?) -- log and ignore.
ALOGE("HWComposer: framebufferTarget is null");
return NO_ERROR;
}
// If the acquireFence fd is not -1, i.e. OpenGL composited the layers with a
// hardware implementation, dup the fd so it can be handed to hwcomposer
int acquireFenceFd = -1;
if (acquireFence->isValid()) {
acquireFenceFd = acquireFence->dup();
}
// Set the framebufferTarget's buffer handle and the framebufferTarget layer's acquireFenceFd
// ALOGD("fbPost: handle=%p, fence=%d", buf->handle, acquireFenceFd);
disp.fbTargetHandle = buf->handle;
disp.framebufferTarget->handle = disp.fbTargetHandle;
disp.framebufferTarget->acquireFenceFd = acquireFenceFd;
return NO_ERROR;
}
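For layers handled as overlays, only the acquire fence has been set so far. After the hwcomposer HAL has processed them it must attach a release fence, but that part of the code is not visible to us. So how does this release fence get set on the layer?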
void SurfaceFlinger::postFramebuffer()
{
for (size_t dpy=0 ; dpy<mDisplays.size() ; dpy++) {
sp<const DisplayDevice> hw(mDisplays[dpy]);
const Vector< sp<Layer> >& currentLayers(hw->getVisibleLayersSortedByZ());
hw->onSwapBuffersCompleted(hwc);
const size_t count = currentLayers.size();
int32_t id = hw->getHwcDisplayId();
if (id >=0 && hwc.initCheck() == NO_ERROR) {
HWComposer::LayerListIterator cur = hwc.begin(id);
const HWComposer::LayerListIterator end = hwc.end(id);
// Run onLayerDisplayed on every layer to set its release fence.
// The release fence signals once hwcomposer has finished compositing the layer.
for (size_t i = 0; cur != end && i < count; ++i, ++cur) {
currentLayers[i]->onLayerDisplayed(hw, &*cur);
}
} else {
for (size_t i = 0; i < count; i++) {
currentLayers[i]->onLayerDisplayed(hw, NULL);
}
}
}
}
void Layer::onLayerDisplayed(const sp<const DisplayDevice>& hw,
HWComposer::HWCLayerInterface* layer) {
if (layer) {
layer->onDisplayed();
// Store the fence in the slot
mSurfaceFlingerConsumer->setReleaseFence(layer->getAndResetReleaseFence());
}
}
virtual sp<Fence> getAndResetReleaseFence() {
// Fetch the layer's releaseFenceFd
int fd = getLayer()->releaseFenceFd;
getLayer()->releaseFenceFd = -1;
// Wrap the fd in a new Fence
return fd >= 0 ? new Fence(fd) : Fence::NO_FENCE;
}