Game Engine Architecture 8

Date: 2024-06-27 10:06:44

1、Differences across Operating Systems

  • UNIX uses a forward slash (/) as its path component separator, while DOS and older versions of Windows used a backslash (\) as the path separator. Recent versions of Windows allow either forward or backward slashes to be used to separate path components, although some applications still fail to accept forward slashes.

  • Some filesystems consider paths and filenames to be case-sensitive (like UNIX and its variants), while others are case-insensitive (like Windows).

  • UNIX and its variants don’t support volumes as separate directory hierarchies.

  • On Microsoft Windows, volumes can be specified in two ways. A local disk drive is specified using a single letter followed by a colon (e.g., the ubiquitous C:). A remote network share can either be mounted so that it looks like a local disk, or it can be referenced via a volume specifier consisting of two backslashes followed by the remote computer name and the name of a shared directory or resource on that machine (e.g., \\some-computer\some-share).

2、Search Paths

  A search path is a string containing a list of paths, each separated by a special character such as a colon or semicolon, which is searched when looking for a file.
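
  A minimal sketch of how a search path might be consumed. The names SplitSearchPath and FindInSearchPath, and the fileExists callback, are hypothetical helpers introduced for illustration; a real engine would query its own file system layer instead of a callback.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Split a search path such as "/usr/bin:/bin" into its directories.
// The separator is a parameter because it differs by platform
// (commonly ':' on UNIX-like systems and ';' on Windows).
std::vector<std::string> SplitSearchPath(const std::string& searchPath,
                                         char separator)
{
    assert(separator != '\0');
    std::vector<std::string> dirs;
    std::string::size_type start = 0;
    while (start <= searchPath.size())
    {
        std::string::size_type end = searchPath.find(separator, start);
        if (end == std::string::npos)
            end = searchPath.size();
        if (end > start) // skip empty components
            dirs.push_back(searchPath.substr(start, end - start));
        start = end + 1;
    }
    return dirs;
}

// Try each directory in order and return the first candidate for which
// the (assumed) fileExists callback reports success, or "" if none does.
std::string FindInSearchPath(const std::string& fileName,
                             const std::string& searchPath,
                             char separator,
                             const std::function<bool(const std::string&)>& fileExists)
{
    for (const std::string& dir : SplitSearchPath(searchPath, separator))
    {
        const std::string candidate = dir + "/" + fileName;
        if (fileExists(candidate))
            return candidate;
    }
    return std::string();
}
```

  The directories are tried strictly in the order they appear in the string, so earlier entries shadow later ones.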

3、Basic File I/O

  Every file I/O API requires data blocks known as buffers to serve as the source or destination of the bytes passing between the program and the file on disk. We say a file I/O API is buffered when the API manages the necessary input and output data buffers for you. With an unbuffered API, it is the responsibility of the programmer using the API to allocate and manage the data buffers.

  In short: with a buffered API, the API manages the buffers; with an unbuffered API, the caller manages them.

4、The Resource Manager

  Every resource manager is comprised of two distinct but integrated components. One component manages the chain of offline tools used to create the assets and transform them into their engine-ready form. The other component manages the resources at runtime, ensuring that they are loaded into memory in advance of being needed by the game and making sure they are unloaded from memory when no longer needed.

5、OGRE’s Resource Manager System

  OGRE lacks any kind of offline resource database.

6、Resource Dependencies and Build Rules

  If the format of the files used to store triangle meshes changes, for instance, all meshes in the entire game may need to be reexported and/or rebuilt. Some game engines employ data formats that are robust to version changes. For example, an asset may contain a version number, and the game engine may include code that “knows” how to load and make use of legacy assets. The downside of such a policy is that asset files and engine code tend to become bulky. When data format changes are relatively rare, it may be better to just bite the bullet and reprocess all the files when format changes do occur.

  In short: give every asset format a version number so that older formats can still be loaded.
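
  A minimal sketch of version-aware loading. The two mesh layouts below (version 1 and version 2) and the names MeshInfo and LoadMeshInfo are invented purely for illustration; they are not taken from any real engine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical in-memory result of loading a mesh header.
struct MeshInfo
{
    std::uint32_t vertexCount;
    bool          hasNormals;
};

// Assumed on-disk layouts (32-bit words):
//   version 1: { version, vertexCount }
//   version 2: { version, vertexCount, hasNormals }
// The loader keeps a code path for each legacy version, which is the
// "bulk" the text warns about.
bool LoadMeshInfo(const std::uint32_t* words, std::size_t wordCount,
                  MeshInfo& out)
{
    assert(words != nullptr);
    if (wordCount < 2)
        return false;

    switch (words[0]) // first word is the format version
    {
    case 1:
        out.vertexCount = words[1];
        out.hasNormals  = false; // version 1 never stored normals
        return true;
    case 2:
        if (wordCount < 3)
            return false;
        out.vertexCount = words[1];
        out.hasNormals  = (words[2] != 0);
        return true;
    default:
        return false; // unknown (probably newer) version
    }
}
```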

7、Stack-Based Resource Allocation

  On Hydro Thunder, Midway used a double-ended stack allocator. The lower stack was used for persistent data loads, while the upper was used for temporary allocations that were freed every frame. Another way a double-ended stack allocator can be used is to ping-pong level loads. Such an approach was used at Bionic Games, Inc. for one of their projects. The basic idea is to load a compressed version of level B into the upper stack, while the currently active level A resides (in uncompressed form) in the lower stack. To switch from level A to level B, we simply free level A’s resources (by clearing the lower stack) and then decompress level B from the upper stack into the lower stack. Decompression is generally much faster than loading data from disk, so this approach effectively eliminates the load time that would otherwise be experienced by the player between levels.
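
  A minimal sketch of a double-ended stack allocator, to make the two "ends" concrete. The class and method names are hypothetical, and the sketch ignores alignment, which a production allocator would have to handle.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One fixed block of memory; the lower stack grows upward from the
// bottom and the upper stack grows downward from the top.
class DoubleEndedStackAllocator
{
public:
    explicit DoubleEndedStackAllocator(std::size_t sizeBytes)
        : m_buffer(sizeBytes), m_lowerTop(0), m_upperTop(sizeBytes)
    {
        assert(sizeBytes > 0);
    }

    // Allocate from the lower stack; returns nullptr when the two
    // stacks would collide.
    void* AllocLower(std::size_t bytes)
    {
        if (bytes > FreeBytes())
            return nullptr;
        void* p = m_buffer.data() + m_lowerTop;
        m_lowerTop += bytes;
        return p;
    }

    // Allocate from the upper stack.
    void* AllocUpper(std::size_t bytes)
    {
        if (bytes > FreeBytes())
            return nullptr;
        m_upperTop -= bytes;
        return m_buffer.data() + m_upperTop;
    }

    // Free everything on one end in a single step.
    void ClearLower() { m_lowerTop = 0; }
    void ClearUpper() { m_upperTop = m_buffer.size(); }

    std::size_t FreeBytes() const { return m_upperTop - m_lowerTop; }

private:
    std::vector<std::uint8_t> m_buffer;
    std::size_t m_lowerTop; // first free byte above the lower stack
    std::size_t m_upperTop; // first used byte of the upper stack
};
```

  In the ping-pong scheme described above, the compressed level B would be placed with AllocUpper(), ClearLower() would release level A, and the decompressed data would then be written into memory obtained from AllocLower().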

7.1、Sectioned Resource Files

  A typical resource file might contain between one and four sections, each of which is divided into one or more chunks for the purposes of pool allocation as described above. 

  1)One section might contain data that is destined for main RAM,

  2)while another section might contain video RAM data.

  3)Another section could contain temporary data that is needed during the loading process but is discarded once the resource has been completely loaded.

  4)Yet another section might contain debugging information.

8、Handling Cross-References between Resources

  1)GUIDs as Cross-References

  2)Pointer Fix-Up Tables

    Another approach that is often used when storing data objects into a binary file is to convert the pointers into file offsets.
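
  A minimal sketch of the load-time half of this idea: pointers were written to the file as offsets, and a fix-up table lists where those offsets live so they can be turned back into real addresses. The function name and the table layout (a list of byte offsets of pointer slots) are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

static_assert(sizeof(std::uintptr_t) == sizeof(void*),
              "sketch assumes offsets and pointers have the same size");

// image      : the binary file loaded into one contiguous block.
// fixUpTable : byte offsets (within image) of every pointer slot.
// Before the call each slot holds a file offset; afterwards it holds
// the address image + offset.
void FixUpPointers(std::uint8_t* image,
                   const std::vector<std::size_t>& fixUpTable)
{
    assert(image != nullptr);
    for (std::size_t slotOffset : fixUpTable)
    {
        std::uintptr_t fileOffset = 0;
        std::memcpy(&fileOffset, image + slotOffset, sizeof(fileOffset));

        std::uint8_t* target = image + fileOffset;
        std::memcpy(image + slotOffset, &target, sizeof(target));
    }
}
```

  memcpy is used for the reads and writes so the sketch does not depend on the slots being suitably aligned.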

8.1、Windows Message Pumps  

while (true)
{
    // Service any and all pending Windows messages.
    // PM_REMOVE takes each message off the queue as it is read.
    MSG msg;
    while (PeekMessage(&msg, nullptr, 0, 0, PM_REMOVE))
    {
        TranslateMessage(&msg);
        DispatchMessage(&msg);
    }

    // No more Windows messages to process -- run one
    // iteration of our "real" game loop.
    RunOneIterationOfGameLoop();
}

9、Updating Based on Elapsed Time

  There is one big problem with this technique: we are using the measured value of Δt taken during frame k as an estimate of the duration of the upcoming frame (k + 1).

  Using last frame’s delta as an estimate of the upcoming frame can have some very real detrimental effects. For example, if we’re not careful it can put the game into a “vicious cycle” of poor frame times. Let’s assume that our physics simulation is most stable when updated once every 33.3 ms (i.e., at 30 Hz). If we get one bad frame, taking say 57 ms, then we might make the mistake of stepping the physics system twice on the next frame, presumably to “cover” the 57 ms that has passed. Those two steps take roughly twice as long to complete as a regular step, causing the next frame to be at least as bad as this one was, and possibly worse. This only serves to exacerbate and prolong the problem.

  In short: a single slow frame can snowball into a run of slow frames.

10、Using a Running Average

  If the camera is pointed down a hallway containing lots of expensive-to-draw objects on one frame, there’s a good chance it will still be pointed down that hallway on the next. Therefore one reasonable approach is to average the frame-time measurements over a small number of frames and use that as the next frame’s estimate of Δt.
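
  A minimal sketch of such a running average, using a small ring buffer of recent frame times. The class name, the window size parameter and the fallback argument are all hypothetical choices for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Keeps the last N measured frame times and reports their mean as the
// estimate of the next frame's delta time.
template <std::size_t N>
class FrameTimeAverager
{
    static_assert(N > 0, "window must hold at least one sample");

public:
    // Record the measured duration of the frame that just ended.
    void AddSample(float dtSeconds)
    {
        assert(dtSeconds >= 0.0f);
        m_samples[m_next] = dtSeconds;
        m_next = (m_next + 1) % N; // overwrite the oldest sample next
        if (m_count < N)
            ++m_count;
    }

    // Mean of the stored samples, or fallbackDt before any exist.
    float EstimateNextDt(float fallbackDt) const
    {
        if (m_count == 0)
            return fallbackDt;
        float sum = 0.0f;
        for (std::size_t i = 0; i < m_count; ++i)
            sum += m_samples[i];
        return sum / static_cast<float>(m_count);
    }

private:
    std::array<float, N> m_samples{};
    std::size_t m_next  = 0; // slot the next sample is written to
    std::size_t m_count = 0; // number of valid samples (at most N)
};
```

  A small window reacts quickly to real changes in load, while a larger one smooths out isolated spikes more; which trade-off is right depends on the game.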

11、Measuring Real Time with a High-Resolution Timer

  Most operating systems provide a function for querying the system time, such as the C standard library function time(). However, such functions are not suitable for measuring elapsed times in a real-time game because they do not provide sufficient resolution. For example, time() returns an integer representing the number of seconds since midnight, January 1, 1970, so its resolution is one second, which is far too coarse.

  All modern CPUs contain a high-resolution timer, which is usually implemented as a hardware register that counts the number of CPU cycles (or some multiple thereof) that have elapsed since the last time the processor was powered on or reset.

  On a Pentium, a special instruction called rdtsc (read time-stamp counter) can be used, although the Win32 API wraps this facility in a pair of functions: QueryPerformanceCounter() reads the 64-bit counter register and QueryPerformanceFrequency() returns the number of counter increments per second for the current CPU. On a PowerPC architecture, such as the chips found in the Xbox 360 and PlayStation 3, the instruction mftb (move from time base register) can be used to read the two 32-bit time base registers, while on other PowerPC architectures, the instruction mfspr (move from special-purpose register) is used instead.
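
  A minimal sketch of measuring frame time with a high-resolution clock. It deliberately swaps in the portable C++ standard library clock std::chrono::steady_clock rather than the platform-specific facilities named above, and the FrameTimer class is a hypothetical wrapper, not an API from any engine.

```cpp
#include <cassert>
#include <chrono>

// Measures the time between successive calls to Tick().
class FrameTimer
{
    using Clock = std::chrono::steady_clock; // monotonic clock

public:
    FrameTimer() : m_last(Clock::now()) {}

    // Returns the seconds elapsed since the previous Tick() (or since
    // construction, for the first call).
    float Tick()
    {
        const Clock::time_point now = Clock::now();
        const std::chrono::duration<float> dt = now - m_last;
        m_last = now;
        return dt.count();
    }

private:
    Clock::time_point m_last;
};
```

  Calling Tick() once at the top of each game loop iteration yields the Δt of the frame that just finished.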

12、High-Resolution Clock Drift

  On some multicore processors, the high-resolution timers are independent on each core, and they can (and do) drift apart. If you try to compare absolute timer readings taken on different cores to one another, you might end up with some strange results, even negative time deltas. Be sure to keep an eye out for these kinds of problems.

13、Dealing with Breakpoints

  When your game hits a breakpoint, its loop stops running and the debugger takes over. However, if your game is running on the same computer that your debugger is running on, then the CPU continues to run, and the real-time clock continues to accrue cycles. A large amount of wall clock time can pass while you are inspecting your code at a breakpoint. When you allow the program to continue, this can lead to a measured frame time many seconds, or even minutes or hours in duration!
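
  One common workaround is to treat an implausibly large frame time as a sign that the game was paused in the debugger, and to substitute the target frame time for that one frame. The sketch below assumes a one-second threshold by default; the function name and the threshold are illustrative choices, not fixed rules. It also rejects negative deltas, which can appear when timers drift between cores.

```cpp
#include <cassert>

// Returns measuredDt unless it is negative or larger than
// maxPlausibleDt, in which case targetDt is returned instead.
float SanitizeFrameTime(float measuredDt,
                        float targetDt,
                        float maxPlausibleDt = 1.0f)
{
    assert(targetDt > 0.0f);
    assert(maxPlausibleDt >= targetDt);

    if (measuredDt < 0.0f || measuredDt > maxPlausibleDt)
    {
        // Probably resumed from a breakpoint (or hit clock drift):
        // pretend this frame ran at the target rate.
        return targetDt;
    }
    return measuredDt;
}
```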

14、Task Decomposition

  Decomposition transforms our software from a sequential program into a concurrent one. There are two broad kinds:

    1)task parallelism

    2)data parallelism

15、One Thread per Subsystem

  Yet another issue is that some engine subsystems depend on the data produced by others. For example, the rendering and audio subsystems cannot start doing their work for frame N until the animation, collision and physics systems have completed their work for frame N. We cannot run two subsystems in parallel if they depend on one another like this.

16、Scatter/Gather

17、Making Scatter/Gather More Efficient

  Spawning threads is expensive. Spawning a thread involves a kernel call, as does joining the master thread with its workers. The kernel itself does quite a lot of set-up and teardown work whenever threads come and go. So spawning a bunch of threads every time we want to perform a scatter/gather is impractical.

  We could mitigate the costs of spawning threads by using a pool of prespawned threads.

18、Typical Job System Interface

  A typical job system provides a simple, easy-to-use API that looks very similar to the API for a threading library.

  There’s 1) a function for spawning a job (the equivalent of pthread_create(), often called kicking a job), 2) a function that allows one job to wait for one or more other jobs to terminate (the equivalent of pthread_join()), and 3) perhaps a way for a job to self-terminate “early” (before returning from its entry point function).

  A job system must also provide spin locks or mutexes of some kind for performing critical concurrent operations in an atomic manner. It may also provide facilities for putting jobs to sleep and waking them back up, via condition variables or some similar mechanism.

namespace job
{
    // signature of all job entry points
    typedef void EntryPoint(uintptr_t param);

    // allowable priorities
    enum class Priority
    {
        LOW, NORMAL, HIGH, CRITICAL
    };

    // counter (implementation not shown)
    struct Counter ... ;
    Counter* AllocCounter();
    void FreeCounter(Counter* pCounter);

    // simple job declaration
    struct Declaration
    {
        EntryPoint* m_pEntryPoint;
        uintptr_t   m_param;
        Priority    m_priority;
        Counter*    m_pCounter;
    };

    // kick a job
    void KickJob(const Declaration& decl);
    void KickJobs(int count, const Declaration aDecl[]);

    // wait for job to terminate (for its Counter to become zero)
    void WaitForCounter(Counter* pCounter);

    // kick jobs and wait for completion
    void KickJobAndWait(const Declaration& decl);
    void KickJobsAndWait(int count, const Declaration aDecl[]);
}

19、A Simple Job System Based on a Thread Pool  

namespace job
{
    void* JobWorkerThread(void*)
    {
        // keep on running jobs forever...
        while (true)
        {
            Declaration declCopy;

            // wait for a job to become available
            pthread_mutex_lock(&g_mutex);
            while (!g_ready)
            {
                pthread_cond_wait(&g_jobCv, &g_mutex);
            }

            // copy the JobDeclaration locally and
            // release our mutex lock
            declCopy = GetNextJobFromQueue();
            pthread_mutex_unlock(&g_mutex);

            // run the job
            declCopy.m_pEntryPoint(declCopy.m_param);

            // job is done! rinse and repeat...
        }
    }
}

  In our simple thread-pool-based job system, every job must run to completion once it starts running. It cannot “go to sleep” waiting for the results of the ray cast, allowing other jobs to run on the worker thread, and then “wake up” later, when the ray cast results are ready.

20、Jobs as Fibers

  Naughty Dog’s job system is based on fibers.  

21、Job Counters

  Whenever a job is kicked, it can optionally be associated with a counter (provided to it via the job::Declaration). The act of kicking the job increments the counter, and when the job terminates the counter is decremented. Waiting for a batch of jobs, then, involves simply kicking them all off with the same counter, and then waiting until that counter reaches zero (which indicates that all jobs have completed their work.) Waiting until a counter reaches zero is much more efficient than polling the individual jobs, because the check can be made at the moment the counter is decremented. As such, a counter based system can be a performance win. Counters like this are used in the Naughty Dog job system.
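
  A minimal sketch of the counter bookkeeping described above, using std::atomic. This is an illustration of the idea only, not Naughty Dog's implementation; the function names are hypothetical, and waking any waiting jobs is left out.

```cpp
#include <atomic>
#include <cassert>

// Shared by every job in a batch.
struct Counter
{
    std::atomic<int> value{0};
};

// Called when a job associated with this counter is kicked.
void OnJobKicked(Counter& counter)
{
    counter.value.fetch_add(1, std::memory_order_relaxed);
}

// Called when a job terminates. Returns true if this decrement brought
// the counter to zero, which is the moment at which a job system could
// wake whoever is waiting, without polling individual jobs.
bool OnJobFinished(Counter& counter)
{
    const int previous =
        counter.value.fetch_sub(1, std::memory_order_acq_rel);
    assert(previous > 0); // more finishes than kicks would be a bug
    return previous == 1;
}

// True once every job kicked with this counter has finished.
bool IsComplete(const Counter& counter)
{
    return counter.value.load(std::memory_order_acquire) == 0;
}
```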

22、Job Synchronization Primitives

  Job synchronization primitives are usually not simply wrappers around the kernel’s thread synchronization primitives. To see why, consider what an OS mutex does: It puts a thread to sleep whenever the lock it’s trying to acquire is already being held by another thread. If we were to implement our job system as a thread pool, then waiting on a mutex within a job would put the entire worker thread to sleep, not just the one job that wants to wait for the lock. Clearly this would pose a serious problem, because no jobs would be able to run on that thread’s core until the thread wakes back up. Such a system would very likely be fraught with deadlock issues.

  To overcome this problem, jobs could use spin locks instead of OS mutexes. This approach works well as long as there’s not very much lock contention between threads, because in that case no job will ever busy-wait for very long trying to obtain a lock. Naughty Dog’s job system uses spin locks for most of its locking needs.
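
  A minimal sketch of a spin lock built on std::atomic_flag, to show what "busy-waiting" means in practice. The class is a generic illustration, not the lock used in any particular engine.

```cpp
#include <atomic>

// A lock that never puts the calling thread to sleep: Acquire() simply
// retries until the flag is free.
class SpinLock
{
public:
    // Attempt to take the lock once; returns true on success.
    bool TryAcquire()
    {
        // test_and_set returns the previous value, so "false" means
        // the lock was free and is now ours.
        return !m_flag.test_and_set(std::memory_order_acquire);
    }

    // Busy-wait until the lock is obtained.
    void Acquire()
    {
        while (!TryAcquire())
        {
            // spin
        }
    }

    void Release()
    {
        m_flag.clear(std::memory_order_release);
    }

private:
    std::atomic_flag m_flag = ATOMIC_FLAG_INIT;
};
```

  Because Acquire() burns CPU while it waits, this only pays off when the lock is held briefly and contention is low, as the text notes.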

23、The Naughty Dog Job System

  Fiber creation is slow on the PS4, so a pool of fibers is pre-spawned, along with memory blocks to serve as each fiber’s call stack.

  When a job waits on a counter, the job is put to sleep and its fiber (execution context) is placed on a wait list, along with the counter it is waiting for. When this counter hits zero, the job is woken back up so it can continue where it left off.

24、Digital Buttons

  Game programmers often refer to a pressed button as being down and a non-pressed button as being up.

25、HID Cameras

1)Infrared sensor

  IR sensor. This sensor is essentially a low-resolution camera that records a two-dimensional infrared image of whatever the Wiimote is pointed at. The Wii comes with a “sensor bar” that sits on top of your television set and contains two infrared light emitting diodes (LEDs). In the image recorded by the IR camera, these LEDs appear as two bright dots on an otherwise dark background. Image processing software in the Wiimote analyzes the image and isolates the location and size of the two dots.

2)Sony high-quality camera (PlayStation Eye)

  It can be used for simple video conferencing, like any web cam. It could also conceivably be used much like the Wiimote’s IR camera, for position, orientation and depth sensing.

  With the PlayStation 4, Sony has improved the Eye and re-dubbed it the PlayStation Camera. When combined with the PlayStation Move controller (see Figure 9.12) or the DualShock 4 controller, the PlayStation can detect gestures in basically the same way as Microsoft’s innovative Kinect system can.

26、Other Inputs and Outputs

  Innovation is actively taking place in the field of human interfaces. Some of the most interesting areas today are gestural interfaces and thought-controlled devices.

27、Button Up and Button Down

class ButtonState
{
    U32 m_buttonStates;     // current frame's button states
    U32 m_prevButtonStates; // previous frame's states
    U32 m_buttonDowns;      // 1 = button pressed this frame
    U32 m_buttonUps;        // 1 = button released this frame

    void DetectButtonUpDownEvents()
    {
        // Assuming that m_buttonStates and
        // m_prevButtonStates are valid, generate
        // m_buttonDowns and m_buttonUps.

        // First determine which bits have changed via
        // XOR.
        U32 buttonChanges = m_buttonStates ^ m_prevButtonStates;

        // Now use AND to mask off only the bits that
        // are DOWN.
        m_buttonDowns = buttonChanges & m_buttonStates;

        // Use AND-NOT to mask off only the bits that
        // are UP.
        m_buttonUps = buttonChanges & (~m_buttonStates);
    }

    // ...
};

28、Chords (pressing multiple buttons simultaneously)

  Humans aren’t perfect, and they often press one or more of the buttons in the chord slightly earlier than the rest. So our chord-detection code must be robust to the possibility that we’ll observe one or more individual buttons on frame i and the rest of the chord on frame i + 1 (or even multiple frames later). There are a number of ways to handle this.

29、Rapid Button Tapping

  Use the tap frequency to decide whether the button is being pressed rapidly enough.

class ButtonTapDetector
{
    U32 m_buttonMask; // which button to observe (bit mask)
    F32 m_dtMax;      // max allowed time between presses
    F32 m_tLast;      // last button-down event, in seconds

public:
    // Construct an object that detects rapid tapping of
    // the given button (identified by an index).
    ButtonTapDetector(U32 buttonId, F32 dtMax) :
        m_buttonMask(1U << buttonId),
        m_dtMax(dtMax),
        m_tLast(CurrentTime() - dtMax) // start out invalid
    {
    }

    // Call this at any time to query whether or not
    // the gesture is currently being performed.
    bool IsGestureValid() const
    {
        F32 t = CurrentTime();
        F32 dt = t - m_tLast;
        return (dt < m_dtMax);
    }

    // Call this once per frame.
    void Update()
    {
        if (ButtonsJustWentDown(m_buttonMask))
        {
            m_tLast = CurrentTime();
        }
    }
};

30、Multibutton Sequence (the A-B-A sequence problem)

class ButtonSequenceDetector
{
    U32*    m_aButtonIds;  // sequence of buttons to watch for
    U32     m_buttonCount; // number of buttons in sequence
    F32     m_dtMax;       // max time for entire sequence
    EventId m_eventId;     // event to send when complete
    U32     m_iButton;     // next button to watch for in seq.
    F32     m_tStart;      // start time of sequence, in seconds

public:
    // Construct an object that detects the given button
    // sequence. When the sequence is successfully
    // detected, the given event is broadcast so that the
    // rest of the game can respond in an appropriate way.
    ButtonSequenceDetector(U32* aButtonIds,
                           U32 buttonCount,
                           F32 dtMax,
                           EventId eventIdToSend) :
        m_aButtonIds(aButtonIds),
        m_buttonCount(buttonCount),
        m_dtMax(dtMax),
        m_eventId(eventIdToSend), // event to send when complete
        m_iButton(0),             // start of sequence
        m_tStart(0)               // initial value irrelevant
    {
    }

    // Call this once per frame.
    void Update()
    {
        ASSERT(m_iButton < m_buttonCount);

        // Determine which button we're expecting next, as
        // a bitmask (shift a 1 up to the correct bit
        // index).
        U32 buttonMask = (1U << m_aButtonIds[m_iButton]);

        // If any button OTHER than the expected button
        // just went down, invalidate the sequence. (Use
        // the bitwise NOT operator to check for all other
        // buttons.)
        if (ButtonsJustWentDown(~buttonMask))
        {
            m_iButton = 0; // reset
        }
        // Otherwise, if the expected button just went
        // down, check dt and update our state appropriately.
        else if (ButtonsJustWentDown(buttonMask))
        {
            if (m_iButton == 0)
            {
                // This is the first button in the
                // sequence.
                m_tStart = CurrentTime();
                m_iButton++; // advance to next button
            }
            else
            {
                F32 dt = CurrentTime() - m_tStart;
                if (dt < m_dtMax)
                {
                    // Sequence is still valid.
                    m_iButton++; // advance to next button

                    // Is the sequence complete?
                    if (m_iButton == m_buttonCount)
                    {
                        BroadcastEvent(m_eventId);
                        m_iButton = 0; // reset
                    }
                }
                else
                {
                    // Sorry, not fast enough.
                    m_iButton = 0; // reset
                }
            }
        }
    }
};

31、Thumb Stick Rotation

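  To detect whether the thumb stick has been rotated through a full circle, divide the stick’s two-dimensional range into four quadrants. If the quadrant sequence TL, TR, BR, BL (top-left, top-right, bottom-right, bottom-left) is observed, the stick is judged to have completed one rotation.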
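
  A minimal sketch of the quadrant-sequence idea. It assumes stick axes in the range [-1, 1] with +y pointing up, detects only the clockwise order TL, TR, BR, BL, and ignores dead zones and time limits; the names are hypothetical.

```cpp
#include <cassert>

enum class Quadrant { TL, TR, BR, BL };

// Map a stick deflection to one of the four quadrants.
Quadrant ClassifyStick(float x, float y)
{
    if (y >= 0.0f)
        return (x < 0.0f) ? Quadrant::TL : Quadrant::TR;
    return (x < 0.0f) ? Quadrant::BL : Quadrant::BR;
}

class StickRotationDetector
{
public:
    // Feed one stick sample per frame. Returns true on the frame that
    // completes a TL, TR, BR, BL sweep.
    bool Update(float x, float y)
    {
        static const Quadrant kSequence[4] = {
            Quadrant::TL, Quadrant::TR, Quadrant::BR, Quadrant::BL };

        const Quadrant q = ClassifyStick(x, y);
        if (m_hasLast && q == m_last)
            return false; // still inside the same quadrant
        m_last = q;
        m_hasLast = true;

        assert(m_next >= 0 && m_next < 4);
        if (q == kSequence[m_next])
        {
            ++m_next;
            if (m_next == 4)
            {
                m_next = 0; // ready to detect the next rotation
                return true;
            }
        }
        else
        {
            // Wrong quadrant: restart. Entering TL already counts as
            // the first step of a fresh attempt.
            m_next = (q == Quadrant::TL) ? 1 : 0;
        }
        return false;
    }

private:
    Quadrant m_last = Quadrant::TL;
    bool     m_hasLast = false;
    int      m_next = 0; // index of the quadrant expected next
};
```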
