Advanced Graphics: Performance
COMP 9018 - Advanced Graphics

Performance Optimisation in OpenGL
• Some quotes on optimisation:
  "We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil." - Donald Knuth
  "Rules of Optimization: Rule 1: Don't do it. Rule 2 (for experts only): Don't do it yet." - M. A. Jackson

But in graphics...
• Frequently, performance is critical to utility/value.
• Working on the edge of the possible.
• So may have to optimise.
• Systems are engineered for optimisation.

Making things run faster
• Approaches to optimisation:
  – Use faster hardware
  – Right data in the right place at the right time
  – Getting rid of redundant calculations
  – Tricking the eye ("close enough is good enough")
  – Trading space for time
  – Not drawing (elimination of what wouldn't be seen anyway)
  – Writing it in assembly/C (but very rarely, usually last)

Hardware acceleration
• Can be a good option.
• Problem: Price-performance curve is exponential

More on hardware acceleration
• Implication: It's easy to get very good performance using hardware acceleration, but it gets extremely expensive when trying to obtain excellent performance.
• Don't forget Moore's law.
• Interaction between long development times and Moore's law means the problem sometimes "fixes itself".

Right data in the right place
• One of the best techniques
• The basis of caching
• Exploits "locality": likely to reuse the same information again and again
• Two types:
  – Temporal
  – Spatial

Another way to think of OpenGL
• OpenGL can be thought of as a client-server architecture
• Some examples of client-server:
  – The Web
  – X Windows
• When did we ever say that the client and server were on the same machine?
• OpenGL can run over a network

Client-server concept
• The program that makes API calls is the client
• The OpenGL implementation is the server
• The client sends requests to the server
• Client and server may be different machines, e.g. client is a big mainframe spewing OpenGL commands; server is a PC with hardware acceleration
• Still convenient to think of it as client = my program, server = OS/driver/graphics card

Client-server concept
• Client-server concept is still useful on a single machine.
• Intuition: Client is your program, server is your graphics card
• Why is it a useful concept? It matters for performance: performance differs depending on whether data is stored at the client or at the server.

Right place at the right time
• This is where the client-server stuff comes in.
• Now have graphics cards with 512 MB on board. What use is it?
• Once data is on the graphics card, everything is faster.
• Problem: Once it's on the graphics card, it can't (easily) be modified.

Display lists
• A very simple way to speed up OpenGL.
• Idea: Take almost any sequence of OpenGL commands and package them up; then you can use them like macros.
• Other libraries have similar concepts, e.g. DX has "execute buffers".

When and why
• Why?
  – Convenience: gives something akin to a function calling structure but more efficient.
  – Efficiency: hardware can optimise, reduces function call overhead, data can live on the graphics card
• When?
  – What you want to render is unlikely to change
  – When you are reusing structure
  – When you need speed

Initialisation
• 3 steps: Initialise, define, use.
• Get a display list ID (actually an int) using glGenLists(size)
• Can request more than one list at a time: size is the number of contiguous IDs wanted.
• Returns the first ID of the block. Returns 0 if none are available.

Definition
• Like glBegin() and glEnd():
  glNewList(index, GL_COMPILE);
  ... code for rendering things ...
  glEndList();
• Instead of GL_COMPILE, can be GL_COMPILE_AND_EXECUTE

Use
• To render stuff, use glCallList(index)
• IMPORTANT NOTES:
  – Almost anything can go in a display list: matrix ops, material defs, textures, geometry, lights, whatever...
  – Display lists COPY data: you can't modify the data once it's in a display list, even if it was passed by reference (e.g. if you use gl*fv(object), the list won't notice when object changes).
  – Display lists affect and are affected by the current matrix stack values!!
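
The copy semantics in the notes above can be illustrated without OpenGL at all. The following is a toy model in plain C, not the OpenGL API: `ToyList`, `toy_compile` and `toy_first_x` are hypothetical names invented for this sketch. It only shows why data passed by pointer while a list is compiled is frozen inside the list.

```c
#include <assert.h>
#include <string.h>

/* Toy stand-in for a display list: it stores its own COPY of the
   vertex data handed over while the list was being compiled.
   This models the behaviour only; it is not how a driver works. */
#define TOY_MAX_FLOATS 64

typedef struct {
    float data[TOY_MAX_FLOATS]; /* private copy owned by the list */
    int   count;                /* number of floats stored */
} ToyList;

/* "Compile" the list: copy n floats out of the caller's array. */
static void toy_compile(ToyList *list, const float *src, int n)
{
    assert(n >= 0 && n <= TOY_MAX_FLOATS);
    memcpy(list->data, src, (size_t)n * sizeof(float));
    list->count = n;
}

/* "Call" the list: read back the x coordinate of the first vertex. */
static float toy_first_x(const ToyList *list)
{
    assert(list->count >= 1);
    return list->data[0];
}
```

If an array `v` is compiled with `toy_compile(&list, v, 3)` and `v[0]` is changed afterwards, `toy_first_x(&list)` still returns the old value. In the same way, a real display list has to be rebuilt before it reflects changed data.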

What CAN'T you call in a DL
• Some things not allowed:
  – Anything that asks about the current state.
  – Anything that changes the rendering mode.
  – Anything that makes or deletes a list (but calling another display list is fine; can use this to build a hierarchy)

Code example
• Look at nodisplaylist.c vs displaylist.c
• Conclusion
  – Likely to be much faster, since data lives on graphics card.
  – Not much effort.

Redundant calculations
• Also a very important optimisation technique.
• Closely related to the locality idea.

Redundant calculations
• An example: Vertex arrays.
• Consider rendering a cube in OpenGL.
[Figure: cube with corners labelled 0 to 7]
  glBegin(GL_QUADS);
  glVertex3f(x0, y0, z0);
  glVertex3f(x1, y1, z1);
  glVertex3f(x2, y2, z2);
  glVertex3f(x3, y3, z3);
  glEnd();
  glBegin(GL_QUADS);
  glVertex3f(x1, y1, z1);
  glVertex3f(x5, y5, z5);
  glVertex3f(x6, y6, z6);
  glVertex3f(x2, y2, z2);
  glEnd();

Question
• How many points are transformed and lit in the previous rendering of the cube?
• How many points would minimally have to be transformed and lit in the previous rendering?
• How much of the calculation is wasted?
Answers: 24, 8, 67 per cent

Huge waste!
• Same calculations are repeated.
• How to solve? Use an indexed face set data structure.
• Consists of two lists:
  – A list of coordinates.
  – A list of polygons = a list of lists of vertex indices.

Cube example
• float vertices[][3] = {{x0, y0, z0}, {x1, y1, z1}, {x2, y2, z2}, ..., {x7, y7, z7}};
• int faces[][4] = {{0, 1, 2, 3}, {1, 5, 6, 2}, ..., {4, 5, 6, 7}};
• But what about other data, e.g. surface normals?
• Need to store them too.
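
As a rough check of the 24 / 8 / 67 per cent figures, here is a minimal sketch of a complete indexed face set for the cube. The unit-cube coordinates and the four faces the slide elides are assumptions chosen only to be consistent with the faces that are given; the function names are hypothetical.

```c
#include <assert.h>

#define NUM_VERTICES 8
#define NUM_FACES    6

/* Corner positions of a unit cube (assumed layout, corners 0..7). */
static const float cube_vertices[NUM_VERTICES][3] = {
    {0, 0, 1}, {1, 0, 1}, {1, 1, 1}, {0, 1, 1},
    {0, 0, 0}, {1, 0, 0}, {1, 1, 0}, {0, 1, 0}
};

/* Each face is a list of four indices into cube_vertices. */
static const int cube_faces[NUM_FACES][4] = {
    {0, 1, 2, 3}, {1, 5, 6, 2}, {4, 0, 3, 7},
    {3, 2, 6, 7}, {4, 5, 1, 0}, {4, 5, 6, 7}
};

/* Look up the position of corner k of face f through the index list. */
static const float *face_corner(int f, int k)
{
    assert(f >= 0 && f < NUM_FACES && k >= 0 && k < 4);
    return cube_vertices[cube_faces[f][k]];
}

/* Vertices sent when every face is drawn with its own glVertex calls. */
static int submitted_vertices(void)
{
    return NUM_FACES * 4;
}

/* Distinct vertices actually referenced by the face list. */
static int distinct_vertices(void)
{
    int used[NUM_VERTICES] = {0};
    int distinct = 0;
    for (int f = 0; f < NUM_FACES; f++) {
        for (int k = 0; k < 4; k++) {
            int idx = cube_faces[f][k];
            assert(idx >= 0 && idx < NUM_VERTICES);
            if (!used[idx]) {
                used[idx] = 1;
                distinct++;
            }
        }
    }
    return distinct;
}

/* Percentage of per-vertex work that is repeated, rounded to nearest. */
static int wasted_percent(void)
{
    int total = submitted_vertices();
    int waste = total - distinct_vertices();
    return (100 * waste + total / 2) / total;
}
```

Each corner is shared by three faces, so face-by-face drawing processes it three times. Storing per-vertex normals would mean adding a parallel array indexed the same way.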

Problem: Needs API support
• To do this efficiently, the API needs to support such an approach.
• Any good graphics API (e.g. OpenGL, DX8, Inventor, VRML97, etc.) supports this.
• It goes by various names.
• In OpenGL, called a vertex array.

Using Vertex Arrays
• Can have up to 6 different arrays, for:
  – Vertex coordinates
  – Normals
  – Colours
  – Texture coordinates
  – A few other funky ones: index, edge flag
• Enable whichever arrays you need
• glEnableClientState(GL_VERTEX_ARRAY)

Step 2
• After initialising, tell it where the data lives
• e.g. glVertexPointer(size, type, stride, vertices);
• Size is number of values per vertex (typ. 2, 3 or 4)
• Type = GL_FLOAT or whatever
• Stride is for more funky stuff (e.g. interleaved arrays)
• Similar calls for glNormalPointer, glTexCoordPointer etc.
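
To make the stride parameter less mysterious, here is a small plain-C sketch of the addressing rule glVertexPointer uses: stride is the byte distance between consecutive elements, and 0 means the elements are tightly packed. `PackedVertex` and `element_at` are hypothetical names for this illustration, and the position-then-normal layout is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Interleaved layout assumed for this sketch: position then normal. */
typedef struct {
    float position[3];
    float normal[3];
} PackedVertex;

/* Locate element `index` of an array described by (size, stride, base),
   following the glVertexPointer convention: stride is the byte distance
   between consecutive elements, and 0 means "tightly packed". */
static const float *element_at(const void *base, int size,
                               size_t stride, int index)
{
    assert(base != NULL && size > 0 && index >= 0);
    if (stride == 0)
        stride = (size_t)size * sizeof(float);
    return (const float *)((const unsigned char *)base
                           + (size_t)index * stride);
}
```

With an array `verts` of `PackedVertex`, positions and normals would both be described with a stride of `sizeof(PackedVertex)`, using `verts[0].position` and `verts[0].normal` as the respective base pointers.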

Step 3: Access the data
• Lots of different ways to call. Simplest: glArrayElement(index).
• Action depends on what's enabled, but let's say only vertex arrays are enabled. Then this looks up index in the last thing glVertexPointer was called on (say x) and does glVertex3fv(x).
• If normal arrays were enabled (and the normal for index was y), this would do: glNormal3fv(y); glVertex3fv(x);
• NOTE: belongs between glBegin, glEnd.

Bunches of indices
• Can also give multiple points at once: use glDrawElements(mode, count, type, indices).
• Mode is GL_LINES, GL_POLYGON, etc.
• Count is number of indices
• Type is usually GL_UNSIGNED_INT
• NOTE: Does NOT go between a glBegin/glEnd

glDrawElements
• Functionally equivalent to:
  glBegin(mode);
  for (i = 0; i < count; i++)
      glArrayElement(indices[i]);
  glEnd();
• glDrawRangeElements() is similar, but you specify a constrained range of indices.
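
The equivalence above can be mimicked in plain C to see exactly which values flow down the pipeline. This is a toy model, not OpenGL: `ToyState`, `toy_array_element` and `toy_draw_elements` are hypothetical names, and the "pipeline" is just an array that records what was emitted.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EMITTED 256

/* Toy pipeline state: where the arrays live and what has been sent. */
typedef struct {
    const float *vertices;      /* 3 floats per vertex, always used */
    const float *normals;       /* 3 floats per normal, NULL = disabled */
    float emitted[MAX_EMITTED]; /* record of values sent down the pipe */
    int   emitted_count;
} ToyState;

/* Append one 3-component value to the record. */
static void emit3(ToyState *s, const float *v)
{
    assert(s->emitted_count + 3 <= MAX_EMITTED);
    for (int k = 0; k < 3; k++)
        s->emitted[s->emitted_count++] = v[k];
}

/* Models glArrayElement: normal first (if enabled), then the vertex. */
static void toy_array_element(ToyState *s, unsigned int index)
{
    if (s->normals != NULL)
        emit3(s, s->normals + 3 * index);
    emit3(s, s->vertices + 3 * index);
}

/* Models glDrawElements: one array-element lookup per index. */
static void toy_draw_elements(ToyState *s, int count,
                              const unsigned int *indices)
{
    for (int i = 0; i < count; i++)
        toy_array_element(s, indices[i]);
}
```

Drawing four indices emits 12 floats with only the vertex array set, and 24 floats once a normal array is supplied, with each normal preceding its vertex.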

What does OpenGL do?
• Can cache previously transformed vertices
• Can use glDrawRangeElements to help tell OpenGL what's going to change
• glDrawElements can draw lots of objects. Example: if all polys have four vertices, then use GL_QUADS instead and give a single list of 24 vertex indices.

Vertex Buffer Objects
• "Right stuff at right time"
• Problem: Vertex arrays are client side.
• How to speed up? Put vertex array on server side?
• What is the disadvantage?

OpenGL 2.0
• Vertex buffer objects: very new.
• Idea: Push vertex array to server
• Simple to use:
  – glGenBuffers() to ask for a buffer
  – glBindBuffer() to make it the current buffer
  – glBufferData() specifies the data
  – Then use the usual commands

glBufferData
• glBufferData(target, size, data, usage)
• Most are obvious, but usage?
  – Used to control how buffer gets treated
  – Static vs stream vs dynamic
  – Read vs Copy vs Draw
• Can also use glMapBuffer to modify the data

Code Example
• vertexarray.c
• Note: can mix and match normal with vertex arrays.

Practical implications
• You CAN use display lists and vertex arrays at the same time, but it's a bit tricky.
• When you change data in a vertex array and render immediately, that's fine. But with a display list, the data is copied.
• Example: Say you have a creature with a constantly moving body. Can't use a display list.
• But can use one for, say, a helmet or a head.

Space-time tradeoff
• Sometimes, can use more space to make an algorithm faster, or vice versa.
• E.g. can sometimes precompute values if they will be reused a lot.
• Trading space for time example: precomputing sin/cos tables.
• Trading time for space example: compressed textures (but really still about time).

Tricking the eye
• Lots of examples in what you've already studied.
• E.g. Gouraud shading is nonsense theoretically.
• Strictly, Gouraud shading should be perspective-corrected.
• Not noticeable for Gouraud, but IS noticeable for texture maps.

Not rendering things
• Back face culling: not drawing polygons facing away from us.
• Easy to enable in OpenGL: glEnable(GL_CULL_FACE)
• But lots of other examples: e.g. using visibility trees (similar to BSP trees) and portal systems to cut back on polygons. Any coincidence games are indoors? (more later)
• Also the multires stuff and LOD (more later)
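
What back face culling decides can be sketched in a few lines of plain C. The test below works on screen-space triangles and assumes OpenGL's default conventions (counter-clockwise winding is front-facing, back faces are the ones culled). The function names are hypothetical, and treating zero-area triangles as drawn is a simplification of this sketch.

```c
#include <assert.h>

/* Twice the signed area of a 2D (screen-space) triangle.
   Positive when a, b, c are in counter-clockwise order. */
static float signed_area2(const float a[2], const float b[2],
                          const float c[2])
{
    return (b[0] - a[0]) * (c[1] - a[1])
         - (c[0] - a[0]) * (b[1] - a[1]);
}

/* Returns 1 if the triangle winds clockwise on screen, i.e. it would
   be culled under the default conventions assumed above. */
static int is_back_facing(const float a[2], const float b[2],
                          const float c[2])
{
    return signed_area2(a, b, c) < 0.0f;
}

/* Count how many of n triangles survive culling. Each triangle is
   six floats: x0 y0 x1 y1 x2 y2. */
static int count_drawn(const float *tris, int n)
{
    assert(tris && n >= 0);
    int drawn = 0;
    for (int i = 0; i < n; i++) {
        const float *t = tris + 6 * i;
        if (!is_back_facing(t, t + 2, t + 4))
            drawn++;
    }
    return drawn;
}
```

For a closed, convex object such as the cube, roughly half of the faces fail this test from any viewpoint, so they can be skipped before any rasterising or texturing is done.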

Rewriting code
• Usually the last resort.
• Usually the big gains are in algorithmic improvement, not rewriting code more efficiently or re-implementing in C/Assembly.
• Assembly less significant with RISC processors.
• Very time consuming both initially and long-term.

Profiling
• Profiling is analysing software as it runs to see how much time is spent executing different parts of the code.
• General observation: 90 per cent of time is spent executing 10 per cent of the code.
• Pointless optimising the wrong thing.
• Example: Say you make the code outside the top 10 per cent twice as fast (a 100 per cent improvement). That code accounts for only 10 per cent of the run time, so the program will only run about 5 per cent faster.
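
The arithmetic in the example is an instance of what is usually called Amdahl's law. A minimal sketch, with a hypothetical function name, makes it easy to try other numbers:

```c
#include <assert.h>

/* Overall speedup when a part of the program that takes `fraction`
   of the run time (0..1) is made `factor` times faster. */
static double overall_speedup(double fraction, double factor)
{
    assert(fraction >= 0.0 && fraction <= 1.0 && factor > 0.0);
    return 1.0 / ((1.0 - fraction) + fraction / factor);
}
```

`overall_speedup(0.10, 2.0)` is about 1.05, the 5 per cent gain from the example. Doubling the speed of the hot 90 per cent instead, `overall_speedup(0.90, 2.0)`, gives about 1.82, which is why the profile should decide where the effort goes.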

Bottlenecks
• Profiling frequently reveals the bottleneck (the thing that slows everything down). The type of bottleneck suggests the solution.
• Typical bottlenecks:
  – Fill-limited: Rasterising/texturing polygons. Occurs with software renderers.
  – Geometry-limited: Calculations of geometry. Too many polygons/vertices.
  – Client-side limited: Calculations on client side (e.g. of vertex/texture coordinates). Code optimization? Maybe

Demo
• Profiling