"Windows Game Programming Master Skills" (Second Edition) Chapter 11

zhaozj2021-02-08  320

Part III: Core Game Programming. Chapter 11: Algorithms, Data Structures, Memory Management, and Multithreading; Chapter 12: Artificial Intelligence; Chapter 13: Game Physics; Chapter 14: The Text Era; Chapter 15: Putting It All Together: Writing a Game!

Chapter 11: Algorithms, Data Structures, Memory Management, and Multithreading

"You think I can get a hug after this?" - Bear, Armageddon

This chapter covers detail topics that are often left out of other game programming books. We'll touch on data structures, algorithm analysis, optimization theory, demo making, and more. Mastering these programming details will give you a solid footing, so that when we discuss artificial intelligence in the next chapter, and later on 3D math, none of it will feel out of reach!

The main topics of this chapter are:

• Data structures
• Algorithm analysis
• Optimization theory
• Math tricks
• Mixed-language programming
• Saving the game
• Implementing multithreaded programming techniques

Data Structures

"What data structure should I use?" is probably the question I'm asked most often. The answer: the fastest and most efficient one. In most cases, however, you don't need the most advanced or most complex data structure; instead, keep things as simple as possible. These days speed matters more than memory, so we would rather sacrifice memory!

With that in mind, let's look at the data structures most commonly used in games and how to put them to work.

Static Structures and Arrays

The most basic data structure is, of course, a single record: a structure or class, such as the following:

typedef struct PLAYER_TYP      // tag for forward references
        {
        int state;             // state of player
        int x, y;              // position of player
        // ...
        } PLAYER, *PLAYER_PTR;

Note that in C++ you don't have to use typedef to define a structure type; when you use the struct keyword, the compiler creates the type automatically. In addition, a C++ struct can even contain methods along with public and private sections.
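As a quick illustration of that last point, here is a minimal sketch (my own example, not the book's code) of the same player record written as a C++ struct with a method and access specifiers:

// hypothetical C++ version of the player record, showing that a struct
// may hold methods and public/private sections
struct PLAYER_CPP
{
public:
    int state;                  // state of player
    void Move(int dx, int dy)   // small helper method
        { x += dx; y += dy; }
private:
    int x, y;                   // position of player
};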

PLAYER player_1, player_2; // create a couple of players

This creates two statically defined PLAYER structures. On the other hand, if the game has more than a couple of players, a better approach is an array:

PLAYER players[MAX_PLAYERS]; // the players of the game

You can then process all the players with a simple loop. Fine, but what if you don't know in advance how many players or records the game will have? In that case, estimate the maximum number of elements the array could ever need. If that number is small, say 256 or fewer, and each element is fairly small (under 256 bytes or so), I usually just allocate the array statically and keep a counter of how many elements are active at any time (a small sketch of this approach follows below). You may think this wastes memory, but it is much simpler and faster than walking a linked list or managing a dynamic structure. The point is, if you know the number of elements before the game runs and the number isn't too large, simply allocate the memory up front, either statically or by calling malloc() or new when the game starts.

WARNING: Don't go overboard with static arrays! For example, suppose each structure is 4KB in size and there may be anywhere from 1 to 256 of them. Guarding against the worst case with a static array means committing 1MB of memory up front. In a case like that you clearly need a better approach, such as a linked list or a dynamically allocated array, to avoid the waste.

Static arrays are the right tool for simple data structures whose size can be estimated at startup or compile time. For data structures that may grow or shrink at runtime, however, the linked list is the data structure of choice.
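Here is a minimal sketch of the static-array-plus-counter idea; the names and the value of MAX_PLAYERS are mine, not from the book:

#define MAX_PLAYERS 8

PLAYER players[MAX_PLAYERS];   // statically allocated storage
int    num_players = 0;        // how many entries are currently active

// add a player if there is room; returns the slot index or -1 on failure
int Add_Player(int x, int y)
{
    if (num_players >= MAX_PLAYERS)
        return(-1);                    // array is full

    players[num_players].state = 1;    // mark the entry active
    players[num_players].x     = x;
    players[num_players].y     = y;

    return(num_players++);             // hand back the slot that was used
} // end Add_Player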

Figure 11-1 shows a standard, abstract linked list. A linked list is made up of nodes, and each node contains some data along with a pointer to the next node in the list.

Figure 11-1: A linked list.

Linked lists are very cool because you can insert a node at any position in the list, and you can likewise delete a node from any position. Figure 11-2 illustrates inserting a node into a list. Since nodes carrying data can be inserted and deleted at runtime, the linked list is a very attractive data structure for game programming.

Figure 11-2: Inserting a node into a linked list.

The downside of a linked list is that to find a particular node you must traverse the list node by node (unless you build a second data structure to help with lookups). For example, to get at the 15th element of an array you just write players[15]; with a linked list, you need a traversal algorithm that walks the list node by node until it reaches the target. In the worst case, the number of iterations equals the length of the list, which is O(n). Big O notation describes an operation that takes on the order of n iterations for n elements. Of course, with a clever algorithm and an extra data structure holding a sorted index, you can get lookup speeds nearly as fast as array access.

Creating a Linked List

Let's look at how to create a simple linked list, add nodes, delete nodes, and search for items by a given key. Below is the definition of a basic node:

typedef struct NODE_TYP
        {
        int id;             // ID number of this object
        int age;            // age of person
        char name[32];      // name of person
        NODE_TYP *next;     // this is the link to the next node
        // more fields go here
        } NODE, *NODE_PTR;

To access the list, we keep a head pointer and a tail pointer that point to the first and last nodes of the list, respectively. At the start the list is empty, so both pointers point to NULL:

NODE_PTR head = NULL, tail = NULL;

NOTE: Some programmers like to start a linked list off with a dummy node that is always empty (that is, a node that doesn't represent real data). This is mostly a matter of taste; however, it changes the initial conditions of the creation, insertion, and deletion algorithms, so you might want to experiment with it.

Traversing the List

Surprisingly, traversing the list is the easiest of all linked list operations. The steps are:

1. Start at the head pointer.
2. Visit the node.
3. Advance to the next node.
4. If the pointer is not NULL, repeat steps 2 and 3.

Here is the source code:

void Traverse_List(NODE_PTR head)
{
// this function traverses the linked list and prints out each node

// test if head is null
if (head == NULL)
   {
   printf("\nLinked list is empty!");
   return;
   } // end if

// traverse while there are nodes
while (head != NULL)
      {
      // visit the node: print it out, or whatever...
      printf("\nNode data: id = %d", head->id);
      printf("\nage = %d", head->age);
      printf("\nname = %s\n", head->name);

      // advance to the next node (simple!)
      head = head->next;

      } // end while

printf("\n");

} // end Traverse_List

Cool, huh? Next, let's look at how to insert a node into the list.

Inserting a Node

The first step in inserting a node is creating the node itself. There are two ways to do this: you can pass the new data elements to the insertion function and let it build the node, or you can build the new node first and then pass it to the insertion function. The two approaches are essentially the same. There are also many places you can insert a node into a list. The simplest approaches insert the new node at the head or the tail of the list, which is convenient if you don't care about the ordering of the data. If you want to keep the list sorted, you need a smarter insertion algorithm so the list stays in ascending or descending order after every insertion; this also makes searches faster. For simplicity, I'll show the simplest method, inserting at the end (tail) of the list. A sorted insertion algorithm really isn't much more complicated: scan the list, find the position where the new node belongs, and splice it in. The only trick is making sure you don't lose any pointers or data along the way.

The following code inserts a new node at the tail of the list (slightly harder than inserting at the head). Note the special cases: the empty list and the list with exactly one element.

// this accesses the globals head and tail to make the code easier;
// in real life you might want to use ** pointers and modify
// head and tail inside the function

NODE_PTR Insert_Node(int id, int age, char *name)
{
// this function inserts a node at the end of the list

NODE_PTR new_node = NULL;

// step 1: create the new node
new_node = (NODE_PTR)malloc(sizeof(NODE)); // in C++ use the new operator

// fill in the fields
new_node->id  = id;
new_node->age = age;
strcpy(new_node->name, name); // strings must be copied!
new_node->next = NULL;        // good practice

// step 2: what is the current state of the linked list?

if (head == NULL) // case 1
   {
   // empty list, simplest case
   head = tail = new_node;

   // return the new node
   return(new_node);
   } // end if
else
if ((head != NULL) && (head == tail)) // case 2
   {
   // there is exactly one element, just takes a little finesse...
   head->next = new_node;
   tail       = new_node;

   // return the new node
   return(new_node);
   } // end else if
else // case 3
   {
   // there are two or more elements in the list,
   // so link the new node onto the tail
   tail->next = new_node;
   tail       = new_node;

   // return the new node
   return(new_node);
   } // end else

} // end Insert_Node

As you can see, the code is rather simple, but it's easy to mess up because you're dealing with pointers, so be careful. A clever programmer will quickly notice that case 2 and case 3 can be merged, but writing them out separately makes the code more readable.
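As a quick usage sketch (my own example, not from the book), building a small list is just a matter of calling the function repeatedly:

// build a small list; head and tail are the globals declared above
Insert_Node(0, 25, "alex");
Insert_Node(1, 32, "gwen");
Insert_Node(2, 19, "max");

// and print it out
Traverse_List(head);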

Deleting a Node

Deleting a node is a bit more complicated than inserting one, because pointers have to be patched up and memory released. In most cases you simply want to delete one specified node, but that node might be the head, the tail, or somewhere in the middle, so you must write a general algorithm that handles every possible case. If you don't account for and test every case, the results will be ugly! In general, the algorithm must be able to search the list for a given key, delete the matching node, and free the memory it occupies. Furthermore, the algorithm must repair the link so that the node before the deleted one points to the node after it. Take a look at Figure 11-3 and this will be clear at a glance.

Figure 11-3: Deleting a node from a linked list.

The following function deletes any node from the list, given its key (the ID field):

// again, this function will (possibly) modify the globals head and tail

int Delete_Node(int id) // id of node to delete
{
// this function deletes a node from the linked list, given its id

NODE_PTR curr_ptr = head, // used to search the list
         prev_ptr = head; // previous record

// test if there is a linked list to delete from
if (!head)
   return(-1);

// traverse the list and find the node to delete
while (curr_ptr && curr_ptr->id != id)
      {
      // save this position
      prev_ptr = curr_ptr;
      curr_ptr = curr_ptr->next;
      } // end while

// at this point we have either found the node or hit the end of the list
if (curr_ptr == NULL)
    return(-1); // couldn't find the record

// the record was found, so delete it, but be careful:
// we need to test all the cases

// case 1: one element
if (head == tail)
   {
   // delete the node
   free(head);

   // fix up the pointers
   head = tail = NULL;

   // return the id of the deleted node
   return(id);
   } // end if
else // case 2: front of list
if (curr_ptr == head)
   {
   // move head to the next node
   head = head->next;

   // delete the node
   free(curr_ptr);

   // return the id of the deleted node
   return(id);
   } // end else if
else // case 3: end of list
if (curr_ptr == tail)
   {
   // fix up the previous pointer to point to NULL
   prev_ptr->next = NULL;

   // delete the last node
   free(curr_ptr);

   // point tail to the previous node
   tail = prev_ptr;

   // return the id of the deleted node
   return(id);
   } // end else if
else // case 4: node is in the middle of the list
   {
   // connect the previous node to the next node
   prev_ptr->next = curr_ptr->next;

   // now delete the current node
   free(curr_ptr);

   // return the id of the deleted node
   return(id);
   } // end else

} // end Delete_Node

NOTE: The code handles a number of special cases. Even though each case is simple, be sure you consider every possible situation and don't leave any out.

Finally, you may have noticed that deleting an interior node is a little dramatic: once the node is deleted it can't be recovered, and we had to drag a prev_ptr along just to keep track of the node behind the one being deleted. You can solve this problem, and others like it, with the doubly linked list shown in Figure 11-4.

Figure 11-4: A doubly linked list.

The advantage of a doubly linked list is that you can traverse it in either direction from any node, which makes insertion and deletion much easier to implement. The only change to the data structure is one extra link field, as shown below:

typedef struct NODE_TYP
        {
        int id;             // ID number of this object
        int age;            // age of person
        char name[32];      // name of person
        // more fields go here
        NODE_TYP *next;     // link to the next node
        NODE_TYP *prev;     // link to the previous node
        } NODE, *NODE_PTR;

With a doubly linked list you can search forward or backward from any node, so the pointer bookkeeping needed for insertion and deletion is greatly simplified (a short deletion sketch follows below). The console program DEMO11_1.CPP|EXE is a simple demonstration of singly linked list operations: it inserts nodes, deletes nodes, and traverses the list.

NOTE: DEMO11_1.CPP is a console program rather than a standard Windows .EXE, so set your compiler to build a console application before compiling it. Of course, no DirectX is used here, so you don't have to load any DirectX .LIB files.
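For illustration, here is a minimal sketch (mine, not the book's) of how deleting a node you already have a pointer to becomes trivial with the prev link, with no need to track a previous pointer during the search:

// remove 'node' from a doubly linked list in O(1),
// assuming the NODE type above and the global head/tail pointers
void Remove_Node_Dbl(NODE_PTR node)
{
if (!node) return;

// patch the neighbor on each side (or head/tail at the ends)
if (node->prev) node->prev->next = node->next;
else            head = node->next;

if (node->next) node->next->prev = node->prev;
else            tail = node->prev;

free(node);

} // end Remove_Node_Dbl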

Algorithm Analysis

Algorithm design and analysis is usually considered advanced computer science, but you should master at least the common-sense techniques and ideas before writing complicated algorithms. First, understand that a good algorithm beats all the assembly language and optimizer tricks in the world. For example, as we saw earlier, simply keeping data sorted can cut search time dramatically. So the rule is: pick an algorithm that is robust and suited to the problem and the data, and at the same time pick a data structure that the algorithm can access and manipulate easily. If you use a plain linear array, you can't expect better than linear search time (unless you add a second data structure); if you keep the array sorted, search time drops to logarithmic.

The first step toward writing good algorithms is learning a little algorithm analysis. The formal technique is asymptotic analysis, which is usually grounded in calculus. I don't want to go too deep, so I'll just introduce the concepts. The basic idea when analyzing an algorithm is to look at how many times the main loop executes for n elements, where n is the number of elements in whatever data structure you're using. That's the single most important idea in algorithm analysis. Of course, once you have a good algorithm, the per-iteration cost and the setup overhead matter too, but we start with the loop count. Take a look at this example:

for (int index = 0; index < n; index++)
    {
    // do work, 50 cycles
    } // end for index

Here the loop executes n times, so the execution time is on the order of n, written O(n). O(n) is called Big O notation; it is an upper bound, and a fairly rough estimate of the running time. If you want more accuracy, you know the work inside takes 50 cycles, so you might say the total execution time is:

n * 50 cycles

Wrong! To estimate the loop's cost you must also include the time taken by the loop mechanics themselves: initializing the loop variable, comparing it, incrementing it, and jumping back to the top. Adding those in gives a better estimate:

cycles_total = cycles_initialization + (50 + cycles_inc + cycles_comp + cycles_jump) * n

Here cycles_inc, cycles_comp, and cycles_jump are the cycles needed for the increment, the comparison, and the jump; on a Pentium-class processor each is roughly 1 to 2 cycles. In this case the time spent on the loop mechanics is small compared to the work inside the loop, which is what you want. As a counterexample, many game programmers write pixel plotting as a function rather than a macro or inline code; the pixel plot is so simple that calling the function costs more than plotting the pixel directly! The lesson is: when you design a loop, make sure there is enough work inside it that the time spent doing the work dwarfs the time spent on the loop overhead itself.
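To make the pixel-plotting point concrete, here is a hedged sketch, assuming a simple 8-bit linear frame buffer called video_buffer with a pitch of memory_pitch (these names are mine, not the book's), showing the work done as a macro or inline function instead of a heavyweight function call:

// assumed globals: an 8-bit frame buffer and its pitch in bytes
extern unsigned char *video_buffer;
extern int            memory_pitch;

// macro version: the "work" is a single write, so burying it in a
// normal function call would make the call overhead dominate
#define PLOT_PIXEL8(x, y, color) \
        (video_buffer[(x) + (y)*memory_pitch] = (unsigned char)(color))

// C++ alternative: a small inline function gives type safety with
// the same near-zero call overhead
inline void Plot_Pixel8(int x, int y, unsigned char color)
    { video_buffer[x + y*memory_pitch] = color; }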

Now let's look at another example, one whose time complexity is higher than n:

// outer loop
for (i = 0; i < n; i++)
    {
    // inner loop
    for (j = 1; j < 2*n; j++)
        {
        // do work
        } // end for j

    } // end for i

In this example, assume the work inside the inner loop takes much longer than the loop mechanics, so we can ignore the overhead and just count executions. The outer loop runs n times, and for each of those the inner loop runs (2*n - 1) times, so the inner code executes a total of:

n * (2*n - 1) = 2*n^2 - n

times. This expression has two terms, and 2*n^2 is the dominant one: its value is much larger than the other term, and the gap grows as n grows, as shown in Figure 11-5.

Figure 11-5: The growth rate of 2*n^2 - n.

When n is small, say n = 2, the formula gives 2*(2)^2 - 2 = 6, and the -n term knocks 25% off the total. But when n gets large, say n = 1000, we get 2*(1000)^2 - 1000 = 1,999,000, and the -n term reduces the total by only about 0.05%, which is negligible. Now you can see why 2*n^2 is the dominant term, or, more simply, why n^2 dominates. The complexity of this algorithm is therefore O(n^2). That's bad: an algorithm that performs on the order of n^2 operations is rarely acceptable, so if you come up with one, go back to the drawing board!

That's as far as I want to go with asymptotic analysis. The minimum requirement is that you be able to roughly estimate the running time of your game code; that will help you choose algorithms and decide where to spend your coding effort.

Recursion

The next topic is recursion. Recursion is a problem-solving technique in which a problem is broken down into smaller problems of the same form until the pieces are simple enough to solve directly; the partial solutions are then combined to solve the whole problem. Sounds wonderful, doesn't it?

In game programming we usually use recursive algorithms for searching, sorting, and certain mathematical calculations. The premise of recursion is simple: write a function that calls itself to solve the problem. Sounds incredible, but it works because each time the function calls itself, a fresh set of local variables is created on the stack, so it's as if a brand new function had been called. The things you must watch are that the function doesn't overflow the stack, and that it has a terminating condition so it eventually unwinds, releasing the stack through its return() statements.

Let's look at the standard example: computing the factorial of a number. The factorial of n is written n! and is defined as:

n! = n * (n-1) * (n-2) * (n-3) * ... * (2) * (1), with 0! = 1! = 1

Thus, 5! is 5*4*3*2*1. Here is the factorial computed the usual, iterative way:

// assume integers

int Factorial(int n)
{
int sum = 1; // holds the result

// accumulate the product
while (n >= 1)
      {
      sum *= n;

      // decrement n
      n--;

      } // end while

// return the result
return(sum);

} // end Factorial

Very simple. If the input is 0 or 1, the result is 1. If you input 3, the computation proceeds like this:

sum = sum * 3 = 1 * 3 = 3
sum = sum * 2 = 3 * 2 = 6
sum = sum * 1 = 6 * 1 = 6

Obviously the result is correct, since 3! = 3*2*1 = 6.

Here is the same computation written recursively:

int Factorial_Rec(int n)
{
// test for the terminal case
if (n == 0 || n == 1)
    return(1);
else
    return(n * Factorial_Rec(n-1));

} // end Factorial_Rec

This version isn't much more complex. Let's see how it runs when n is 0 or 1: in both cases the first if statement is TRUE, so the function returns 1 and exits. But when n > 1, something wonderful happens: the else statement executes and returns n times the value of the function called with (n-1). That is the recursion. The current values of the function's variables are saved on the stack, and the next call gets a brand new set of variables as parameters. The return statement of each call can't complete until the call below it finishes, so the process keeps cascading down until the terminal case executes. Let's trace the actual value of n on each call for n = 3.

1. The first call, Factorial_Rec(3), starts executing its return statement:

   return(3 * Factorial_Rec(2));

2. The second call, Factorial_Rec(2), starts executing its return statement:

   return(2 * Factorial_Rec(1));

3. The third call, Factorial_Rec(1), hits the terminal case and returns the value 1:

   return(1);

Now the magic unwinds: the 1 is returned to the second call, which had been waiting in:

   return(2 * Factorial_Rec(1));

which therefore computes return(2 * 1);

That value is then returned to the first call, which had been waiting in:

   return(3 * Factorial_Rec(2));

which computes return(3 * 2), and presto, the function can finally return with 6, which is indeed 3!. That's recursion.

Now you might ask which method is better, recursive or non-recursive? Obviously the straightforward iterative version is faster, because there are no function calls or stack manipulations, but the recursive version is more elegant and better reflects the nature of the problem. That's the reason to use recursion: some algorithms are recursive by nature, and trying to write them non-recursively is tedious and usually ends up as a recursion simulation that manages its own stack! So use recursion when it's appropriate and simplifies the problem; otherwise use straight-line code.

For an example, look at the program DEMO11_2.CPP|EXE, which implements the factorial algorithm. Notice how quickly the factorial overflows; see how high your machine can count! Most machines can only get to about 69!, no kidding.

MATH: As an exercise, try implementing the Fibonacci sequence recursively. The nth Fibonacci number is defined by Fn = Fn-1 + Fn-2, with F0 = 0 and F1 = 1, so F2 = F1 + F0 = 1, F3 = F2 + F1 = 2, and the whole sequence runs 0, 1, 1, 2, 3, 5, 8, 13... Count the seeds in the spirals of a sunflower and you'll find the Fibonacci sequence.
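Here is a minimal recursive sketch of that exercise (my own version, following the definition above):

// recursive Fibonacci: F(0)=0, F(1)=1, F(n)=F(n-1)+F(n-2)
int Fibonacci_Rec(int n)
{
// terminal cases
if (n == 0 || n == 1)
    return(n);
else
    return(Fibonacci_Rec(n-1) + Fibonacci_Rec(n-2));

} // end Fibonacci_Rec

Note that this version spawns two calls per level, so it runs in roughly O(2^n) time; it's fine as an exercise in recursion, but a simple loop is the way to go if you actually need the numbers.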

Trees

Next we'll discuss a more advanced data structure: the tree. Trees are usually processed with recursive algorithms, which is why I covered recursion first. Figure 11-6 illustrates a few different tree data structures.

Figure 11-6: Tree topologies.

Trees were invented for storing and searching massive amounts of data. The most common tree structure is the binary tree, also referred to as a B-tree or binary search tree (BST). A binary tree grows from a single root node, and each node can have at most two child nodes (siblings of each other); that's where the name binary comes from. We can also talk about the order, or height, of a binary tree, meaning how many levels it contains; Figure 11-7 shows binary trees of different heights.

Figure 11-7: Binary trees of various heights.

The interesting thing about trees is the speed with which information can be looked up. Most binary trees are keyed on a single field. For example, suppose you want to build a binary tree holding records for game objects, each with multiple properties; you might use each object's creation time as the key, or define whichever field you like as the key for each node. Here is a node structure that could be used to hold records about people:

typedef struct TNODE_TYP
        {
        int id;              // ID number of this object
        int age;             // age of person (used as the key here)
        char name[32];       // name of person
        TNODE_TYP *right;    // link to right node
        TNODE_TYP *left;     // link to left node
        } TNODE, *TNODE_PTR;

Note how similar this is to the linked list node! The only real difference is how the data structure is used and how the tree is built. Suppose there are five objects (people) with ages 5, 25, 3, 12, and 10. Figure 11-8 shows two different binary trees containing this data; which one you get depends on the insertion order, although a smarter insertion algorithm could maintain certain properties of the tree regardless of the order in which data arrives.

Figure 11-8: Binary trees holding the age data {5, 25, 3, 12, 10}.

NOTE: Of course, the data in this example can be anything you want.

NOTE: For binary trees like those in Figure 11-8, I use the following convention: the right child of any node is always greater than or equal to the node itself, and the left child is always less than it. You can use other conventions as long as you're consistent.

The beauty of a binary tree is that it can hold a massive amount of data and still let you retrieve any item very quickly using a divide-and-conquer search. For example, a binary tree with a million nodes can be searched in at most about 20 comparisons! Isn't that great? The reason is that each iteration of the search cuts the remaining search space in half. In general, with n nodes the average number of comparisons is log2 n, so the running time is O(log2 n). Note that this search time only holds for a balanced binary tree, one in which each level is more or less filled out with both left and right children. If the tree is unbalanced, it degenerates into a linked list and the search time degrades to linear.

The second very cool property of binary trees is that you can grab any subtree and process it on its own; a subtree has all the properties of the full tree. So if you know roughly where something is, you can search just the subtree that contains it. You can even build trees of trees, or maintain an index table of subtrees, without ever processing the whole tree at once. This is very important in 3D scene management: you might build a tree for the entire game universe, with hundreds of subtrees representing the individual rooms in that universe, and also maintain another structure, such as a sorted list of pointers, that indexes into the tree, as shown in Figure 11-9. There will be more on this later in the book.

Figure 11-9: Using a secondary index table alongside a binary tree.

So when should you use a tree? I suggest using one whenever the problem or the data itself looks like a tree. For example, if you sketch the problem and see left and right branches, a tree is obviously appropriate.

Building a Binary Search Tree (BST)

This section is a little more involved, because the algorithms that operate on binary trees are inherently recursive. Let's walk through the basic algorithms, write some code, and then look at a demo. As with linked lists, there are two ways to start a BST: the root can be a dummy node or a real node. I prefer a real node as the root, so an empty tree contains nothing at all and the pointer that will point to the root starts out NULL:

TNODE_PTR root = NULL;

Okay, to insert data into the BST you must decide which field is the key. In this example you could key on either the person's age or the name; I'll use the age. The name would be nearly as easy, since a lexicographic comparison function such as strcmp() is all you need to order names. In any case, the following code inserts a node into the BST:

TNODE_PTR root = NULL; // here's the initial tree

TNODE_PTR BST_Insert_Node(TNODE_PTR root, int id, int age, char *name)
{
// test for empty tree
if (root == NULL)
   {
   // insert node at root
   root = new(TNODE);
   root->id  = id;
   root->age = age;
   strcpy(root->name, name);

   // set links to null
   root->right = NULL;
   root->left  = NULL;

   printf("\nCreating tree.");

   } // end if

// else there is a node here, so go left or right
else
if (age >= root->age)
   {
   printf("\nTraversing right...");

   // insert on the right branch

   // test if this branch leads to another subtree or is terminal;
   // if it leads to another subtree, traverse it, else create a node and link
   if (root->right)
       BST_Insert_Node(root->right, id, age, name);
   else
      {
      // insert node on the right link
      TNODE_PTR node = new(TNODE);
      node->id  = id;
      node->age = age;
      strcpy(node->name, name);

      // set links to null
      node->left  = NULL;
      node->right = NULL;

      // now set the right link of the current "root" to this new node
      root->right = node;

      printf("\nInserting right.");

      } // end else

   } // end else if
else // age < root->age
   {
   printf("\nTraversing left...");

   // must insert on the left branch

   // test if this branch leads to another subtree or is terminal;
   // if it leads to another subtree, traverse it, else create a node and link
   if (root->left)
       BST_Insert_Node(root->left, id, age, name);
   else
      {
      // insert node on the left link
      TNODE_PTR node = new(TNODE);
      node->id  = id;
      node->age = age;
      strcpy(node->name, name);

      // set links to null
      node->left  = NULL;
      node->right = NULL;

      // now set the left link of the current "root" to this new node
      root->left = node;

      printf("\nInserting left.");

      } // end else

   } // end else

// return the root
return(root);

} // end BST_Insert_Node

The function first tests for an empty tree; if the tree is empty, it creates the root node from the first record inserted. Note that whatever record goes in first splits the key space and determines how well the tree will balance out. If the tree already has at least one node, the function traverses the tree, going left or right at each node depending on the record being inserted; when it reaches a leaf node or a terminating branch, it inserts the new node there. A call looks like this:

root = BST_Insert_Node(root, 4, 30, "jim");

Figure 11-10 illustrates how "jim" is inserted into an existing tree.

Figure 11-10: Inserting an element into a BST.

Inserting a new node into a BST costs about the same as searching it, so an insertion takes roughly O(log2 n) on average, with a worst case of O(n) (when the keys arrive already in sorted order and the tree degenerates into a line).

Searching a BST

Once the BST is built, it's time to search it for data, and of course this is where recursion really shines. There are three ways to traverse a BST:

• Preorder - Visit the node, then search the left subtree in preorder, then search the right subtree in preorder.
• Inorder - Search the left subtree in inorder, then visit the node, then search the right subtree in inorder.
• Postorder - Search the left subtree in postorder, then search the right subtree in postorder, then visit the node.

NOTE: Left and right are arbitrary here; what matters is the order in which the node itself is visited relative to its subtrees.

Take a look at Figure 11-11, which shows a simple binary tree and the three traversal orders.

Figure 11-11: The order in which nodes are visited by preorder, inorder, and postorder traversals.

Once you know the three orders, you can implement each of them with a fairly simple recursive algorithm. Of course, the usual purpose of traversing a binary search tree is to find a target node and return it; you can add a termination test to the code below so it stops when it finds the key you're after, but for now the functions simply visit every node. Here is the inorder traversal:

void BST_Inorder_Search(TNODE_PTR root)
{
// this searches a BST using the inorder search

// test for null
if (!root)
    return;

// traverse the left tree
BST_Inorder_Search(root->left);

// visit the node
printf("\nName: %s, Age: %d", root->name, root->age);

// traverse the right tree
BST_Inorder_Search(root->right);

} // end BST_Inorder_Search

Here is the preorder version:

void BST_Preorder_Search(TNODE_PTR root)
{
// this searches a BST using the preorder search

// test for null
if (!root)
    return;

// visit the node
printf("\nName: %s, Age: %d", root->name, root->age);

// traverse the left tree
BST_Preorder_Search(root->left);

// traverse the right tree
BST_Preorder_Search(root->right);

} // end BST_Preorder_Search

And here is the postorder version:

void BST_Postorder_Search(TNODE_PTR root)
{
// this searches a BST using the postorder search

// test for null
if (!root)
    return;

// traverse the left tree
BST_Postorder_Search(root->left);

// traverse the right tree
BST_Postorder_Search(root->right);

// visit the node
printf("\nName: %s, Age: %d", root->name, root->age);

} // end BST_Postorder_Search

Simple, almost like magic, isn't it? Assuming you have built a binary tree, an inorder traversal is just this call:

BST_Inorder_Search(my_tree);

TIP: I can hardly overstate how important tree structures are in 3D graphics, so I hope the material above has sunk in. Otherwise, when you get to building binary space partition (BSP) trees to solve rendering problems, you'll drown in a sea of pointers. :)
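Earlier I mentioned that you can add a termination test to stop a traversal when it finds a target key. In fact, thanks to the ordering convention (right >= node, left < node), a keyed lookup doesn't need to visit both subtrees at all. Here is a minimal sketch (my own, not from the demo code) of a search by age:

// find a node whose age matches the key; returns NULL if not found
TNODE_PTR BST_Find(TNODE_PTR root, int age)
{
// ran off the end of a branch, so the key isn't present
if (!root)
    return(NULL);

// found it
if (root->age == age)
    return(root);

// otherwise the ordering convention tells us which side to look on
if (age >= root->age)
    return(BST_Find(root->right, age));
else
    return(BST_Find(root->left, age));

} // end BST_Find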

You may have noticed that I omitted how to delete a node from the tree. That was intentional: deleting a node is fairly complex, because removing it can break the links between its parent and its subtrees, which then have to be reattached correctly. I'll leave node deletion as an exercise for the reader. For a good data structures reference, I recommend Robert Sedgewick's Algorithms in C++ (third edition, covering fundamentals, data structures, sorting, and searching; a Chinese edition has been published by China Electric Power Press. -Ed.), which discusses trees and their algorithms in depth. Finally, check out the example program DEMO11_3.CPP|EXE, which lets you build a binary search tree and traverse it with all three algorithms. It's also a console program, so compile it accordingly.

Optimization Theory

Game programming puts more emphasis on performance than just about any other kind of programming. Video games continually push the limits of hardware and software: game programmers always want to add more creatures, more special effects, more sound, and better AI. Optimization is therefore critical. In this section we'll cover a number of optimization techniques to get you started. If the subject really interests you, there are whole books devoted to it, such as Inner Loops by Rick Booth (Addison Wesley), Zen of Code Optimization by Michael Abrash (Coriolis Group), and Pentium Processor Optimization by Mike Schmit (AP Press).

The first step toward writing optimized code is to understand the compiler, your data structures, and how the C/C++ you write is ultimately converted into executable machine code. The basic idea is to use simple programming techniques and simple data structures: the more complex and over-designed your code is, the harder it is for the compiler to translate into good machine code, and the slower it runs. Here are the basic guidelines to follow as you program:

• Use 32-bit data as much as possible. 8-bit data takes less space, but Intel processors are 32-bit machines and are optimized for 32-bit data.
• Declare small, frequently called functions as inline.
• Use global variables where it helps speed, but don't destroy the readability of your code.
• Avoid floating-point addition and subtraction where integers will do; integer units are usually faster for those operations.
• In general, prefer integers. Floating-point processors are nearly as fast these days, but integers are exact, so if you don't need fractional precision, use an integer.
• Align (and size) your data structures to 32-byte boundaries. Most compilers can do this with a command-line option, or you can use #pragma directives in the code.
• Pass parameters by pointer rather than by value, unless they are simple built-in types.
• Don't use the register keyword in your code. Although Microsoft claims it can speed up loops, it tends to starve the compiler of free registers and results in worse code.
• If you're a C++ programmer, go ahead and use classes and virtual functions, but don't go overboard with deep inheritance hierarchies.
• Pentium-class processors have internal data and code caches. Keep your functions short so the code fits in the cache (16KB to 32KB), and store data in the order it will be accessed. This improves the cache hit rate and cuts down on accesses to main memory or the secondary cache, which can be ten times slower than the internal cache.
• Pentium-class processors have a RISC-like core: they are good at simple instructions and can issue more than one instruction per cycle across their execution units. So don't cram a lot of cleverness into one line of code; simpler statements are easier for the compiler and the processor to schedule, even if you could merge them into a single functionally identical line.
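As a small illustration of two of these guidelines (the names and sizes here are my own assumptions, not the book's), an inline accessor and a structure manually padded to a 32-byte multiple might look like this:

// small, hot function declared inline so the call overhead disappears
inline int Fast_Distance2(int dx, int dy)
    { return(dx*dx + dy*dy); }

// structure manually padded to 32 bytes so an array of them lines up
// with the cache line size (field sizes assumed for illustration)
typedef struct ALIGNED_OBJ_TYP
        {
        int   state;        // 4 bytes
        int   x, y;         // 8 bytes
        float speed;        // 4 bytes
        int   frame;        // 4 bytes
        char  pad[12];      // pad out to 32 bytes total
        } ALIGNED_OBJ, *ALIGNED_OBJ_PTR;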

Mathematical Tricks

Since a large part of game code is mathematical at heart, it's well worth knowing how to perform math efficiently. There are many common tricks you can use to speed things up:

• Keep integer math with integers and floating-point math with floats. Type conversions cost time, so don't convert unless you're forced to.
• Multiplying an integer by a power of 2 can be done with a left shift; similarly, dividing by a power of 2 can be done with a right shift.
• Multiplication by a constant that isn't a power of 2 can often be turned into a sum or difference of shifts. For example, 640 is not a power of 2, but 512 and 128 are, so the best way to multiply an integer n by 640 is:

product = (n << 7) + (n << 9); // n*128 + n*512 = n*640

However, if the processor can do a multiply in one or two clock cycles, this optimization is a loss.
• If your algorithm works on matrices, take full advantage of any sparseness, that is, elements known to be zero.
• When you create constants, make sure they have the appropriate type, so the compiler doesn't complain or silently convert them to integers. In C++, prefer the const keyword. For example:

const float f = 12.45f;

• Avoid square roots, trigonometric functions, and other complex math functions. In most cases they can be replaced by something simpler based on a particular assumption or approximation, and if the results still aren't good enough, you can fall back on a lookup table; I'll cover lookup tables shortly.
• If you need to zero out a large array of floating-point numbers, you can use memset():

memset((void *)float_array, 0, sizeof(float)*num_floats);

but only for the value zero, because floats are stored in IEEE format and only integer zero and floating-point zero happen to share the same bit pattern.
• Before coding a math expression, see whether you can simplify it on paper first. For example, n*(f+1)/n is just (f+1), since the multiplication and the division cancel.
• If a complex computation is used in more than one place, compute it once and cache the result in a temporary:

// compute a term that is used in more than one expression
float n_squared = n*n;

// use the term in two different expressions
pitch     = 34.5*n_squared + 100*rate;
magnitude = n_squared/length;

Finally, one more important point: make sure the compiler options are set to generate code for the floating-point coprocessor, so your floating-point code runs as fast as possible.

Fixed-Point Math

A few years ago, most 3D engines used fixed-point math to perform the enormous number of transformations and calculations 3D requires. The reason was that processor support for floating point lagged behind integer support, even on the early Pentiums. Today, the Pentium II, Pentium III, and Katmai-class processors have much better floating-point performance, so fixed-point math is no longer as important. However, in many cases the conversion from floating point to integer during rasterization is still slow, so using fixed-point math for the additions and interpolation in inner loops is still a good choice. On low-end Pentium machines such operations can still beat floating point, and you can use tricks to extract the integer part of a fixed-point number quickly, without an expensive float-to-int conversion. All of this comes with some uncertainty: today, doing everything in floating point is usually the best choice, but knowing a bit about fixed-point math is still helpful. My position is to use floating point for all data representation and transformation, and for low-level rasterization loops to try both a fixed-point and a floating-point version and see which is faster. Of course, if you're using pure hardware acceleration, you don't need to worry about it at all. With that in mind, let's look at how fixed-point numbers are represented.

Representing Fixed-Point Numbers

All fixed-point math is really just scaled integer math. Suppose you want to represent 10.5 with an integer. You can't do it directly, because integers have no fractional part: you could truncate it to 10 or round it to 11, but 10.5 itself isn't an integer. However, if you multiply 10.5 by 10, you get 105.0, which is an integer. That's the whole basis of fixed-point math: scale every value by some factor, and take that scale factor into account whenever you do arithmetic. Since computers are binary, most game programmers use a 32-bit integer (an int) split into a 16.16 fixed-point format, as shown in Figure 11-12.

Figure 11-12: A 16.16 fixed-point representation.

The integer (whole) part lives in the upper 16 bits and the fractional part in the lower 16 bits, which means every number has effectively been scaled up by 2^16 = 65,536. To extract the integer part of a fixed-point number, you shift and mask off the upper 16 bits; to get the fractional part, you mask off the lower 16 bits. Here are some useful fixed-point definitions:

#define FP_SHIFT 16      // shifts to produce a fixed-point number
#define FP_SCALE 65536   // scaling factor

typedef int FIXPOINT;

Converting to Fixed-Point

There are two kinds of values you'll want to convert to fixed-point: integers and floating-point numbers. The two conversions are different and must be considered separately. Integers are stored in two's complement binary, so you can simply shift them left to scale them up into fixed-point form. Floating-point numbers are stored in IEEE format, with a mantissa and an exponent packed into the four bytes, so shifting would destroy the value; instead, you convert with a standard floating-point multiply by the scale factor.

MATH: Two's complement is a way of representing binary integers so that both positive and negative numbers can be expressed and ordinary arithmetic works correctly across the whole set. The two's complement of a number is formed by inverting every bit and then adding 1. For example, to represent -6, start with 6 (0110), invert the bits, and add 1: ~0110 + 0001 = 1001 + 0001 = 1010. The bit pattern 1010 would normally read as 10, but interpreted as a two's complement number it is -6.

Here is a macro that converts an integer into fixed-point:

#define INT_TO_FIXP(n) ((FIXPOINT)((n) << FP_SHIFT))

For example:

FIXPOINT speed = INT_TO_FIXP(100);

And here is a macro that converts a floating-point number into fixed-point:

#define FLOAT_TO_FIXP(n) ((FIXPOINT)((float)(n) * FP_SCALE))

For example:

FIXPOINT speed = FLOAT_TO_FIXP(100.5);

Extracting the parts of a fixed-point number is just as simple. Here is a macro that pulls the integer portion out of the upper 16 bits:

#define FIX_INT_PART(n) ((n) >> 16)

And to extract the fractional portion from the lower 16 bits, simply mask off the integer portion:

#define FIX_DEC_PART(n) ((n) & 0x0000FFFF)

Of course, if you're clever you can skip the conversions entirely and use pointers to access the upper and lower 16 bits directly, something like this (assuming a little-endian x86 memory layout):

FIXPOINT fp;
short *integral_part = (short *)&fp + 1,  // points at the upper 16 bits
      *decimal_part  = (short *)&fp;      // points at the lower 16 bits

The pointers integral_part and decimal_part then always refer to the 16 bits you need.

Accuracy

At this point a question may pop into your head: what does a binary fraction actually mean? Usually you don't have to worry about it, because the fractional part is only used during computation; in a rasterization loop, or most other loops, only the integer part is ultimately needed. Still, since the base is 2, the fractional bits represent negative powers of 2, as Figure 11-12 suggests. For example, the binary fixed number 1001.0001 is

9 + 0*(1/2) + 0*(1/4) + 0*(1/8) + 1*(1/16) = 9.0625

This brings us to the concept of precision. With four fractional bits, as in that example, you get roughly one and a half decimal digits of precision, or about ±0.0625. With 16 fractional bits you can resolve 1/2^16 = 0.000015258, about one part in sixty-five thousand, which is accurate enough for most purposes. On the other hand, using only 16 bits for the whole part limits the range to -32768 to 32767 (or 0 to 65535 unsigned). That limitation can become a problem in a huge universe or number space, so you must watch out for overflow.

Addition and Subtraction

Adding and subtracting fixed-point numbers is easy; you just use the standard operators:

FIXPOINT f1 = FLOAT_TO_FIXP(10.5),
         f2 = FLOAT_TO_FIXP(-2.6),
         f3 = 0; // zero is 0 no matter what, baby

// to add them
f3 = f1 + f2;

// to subtract them
f3 = f1 - f2;

NOTE: Thanks to the two's complement representation, negative numbers cause no problems in these operations.

Multiplication and Division

Multiplication and division are slightly trickier than addition and subtraction. The problem is that fixed-point numbers are scaled, so when you multiply two of them you also multiply the two scale factors. Look at the math:

f1 = n1 * scale
f2 = n2 * scale
f3 = f1 * f2 = (n1 * scale) * (n2 * scale) = n1 * n2 * scale^2

See the extra factor of the scale? To correct the problem, you have to divide (or shift) out one factor of the scale, so a fixed-point multiply looks like this:

f3 = ((f1 * f2) >> FP_SHIFT);

Division has a similar problem, but in reverse:

f1 = n1 * scale
f2 = n2 * scale
f3 = f1 / f2 = (n1 * scale) / (n2 * scale) = n1 / n2   // no scale!

The scale factors cancel during the division, so the result is no longer a fixed-point number. That can actually be useful in some cases, but to keep the result in fixed-point form you must scale the numerator up first:

f3 = (f1 << FP_SHIFT) / f2;

WARNING: Fixed-point multiplication and division suffer from overflow and underflow. In the worst case a multiply really needs 64 bits to hold the result, and when you pre-shift the numerator for a divide, its upper bits are usually lost, leaving little more than the fractional part. The solution is to work in a 24.8 format, or to use full 64-bit math, which you can do in assembly language since Pentium-class processors support 64-bit results. Alternatively, just change the format slightly to 24.8; that handles fixed-point multiplies and divides without losing information, but your precision drops.

The program DEMO11_4.CPP|EXE is an example of fixed-point math. It lets you enter two decimal numbers, performs fixed-point calculations on them, and shows the results. Note that the multiplication and division results may be incorrect, because the math is done in 16.16 format rather than with 64-bit intermediates; to fix this, you can recompile the program in 24.8 format. Conditional compilation is controlled by two #defines at the top of the program:

// #define FIXPOINT16_16
// #define FIXPOINT24_8

Uncomment one of the two lines before compiling. This is a console application, so set up the compiler accordingly; do the right thing, as film director Spike Lee would say...
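To make the warning concrete, here is a minimal sketch of a 16.16 multiply and divide done with a 64-bit intermediate; it assumes your compiler has a 64-bit integer type (__int64 on Microsoft VC++, long long elsewhere), and the function names are mine:

// 16.16 fixed-point multiply/divide using a 64-bit intermediate so the
// high bits aren't lost (assumes __int64 is available; use long long otherwise)
FIXPOINT FP_Mul(FIXPOINT f1, FIXPOINT f2)
{
// promote to 64 bits, multiply, then shift one scale factor back out
return((FIXPOINT)(((__int64)f1 * f2) >> FP_SHIFT));
} // end FP_Mul

FIXPOINT FP_Div(FIXPOINT f1, FIXPOINT f2)
{
// pre-scale the numerator in 64 bits so the quotient stays in 16.16
return((FIXPOINT)((((__int64)f1) << FP_SHIFT) / f2));
} // end FP_Div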

Unrolling Loops

The next optimization technique is loop unrolling. Back in the 8-bit and 16-bit days this was one of the best optimizations around, but today it can backfire. Unrolling a loop means taking a loop that repeats a small number of times and writing out each iteration by hand. For example:

// loop before unrolling
for (int index = 0; index < 8; index++)
    {
    // do work
    sum += data[index];
    } // end for index

The problem with this loop is that the work inside it takes less time than the loop mechanics: the increment, the comparison, and the jump. In effect, the loop overhead costs two or three times as much as the body. You can unroll it like this:

// the unrolled version
sum += data[0];
sum += data[1];
sum += data[2];
sum += data[3];
sum += data[4];
sum += data[5];
sum += data[6];
sum += data[7];

That's much better. Two caveats, though. First, if the loop body is much more expensive than the loop mechanics, say it computes a square root, unrolling isn't worth the trouble. Second, since Pentium processors have an internal instruction cache, unrolling a loop too far bloats the code and can blow the cache, which is catastrophic and produces glacially slow code. My suggestion is to keep unrolled loops to somewhere between 8 and 32 iterations.

Lookup Tables

This is my personal favorite optimization trick. A lookup table is a set of precomputed results: you simply compute all the possible answers up front when the program starts, and then just look them up while the game runs. For example, suppose you need the sine and cosine of angles from 0 to 359 degrees. Computing sin() and cos() with the floating-point unit every time takes real time, but with a lookup table each value costs only a few CPU cycles to fetch. For example:

// storage for lookup tables
float sin_look[360];
float cos_look[360];

// create the lookup tables
for (int angle = 0; angle < 360; angle++)
    {
    // convert the angle to radians, since the math library uses
    // rads instead of degrees; remember there are 2*pi rads in 360 degrees
    float rad_angle = angle * (3.14159/180);

    // fill in the next entries in the lookup tables
    sin_look[angle] = sin(rad_angle);
    cos_look[angle] = cos(rad_angle);

    } // end for angle

And here is an example that uses the tables to draw a circle of radius 10:

for (int ang = 0; ang < 360; ang++)
    {
    // compute the next point on the circle
    x_pos = 10 * cos_look[ang];
    y_pos = 10 * sin_look[ang];

    // plot the pixel
    Plot_Pixel((int)x_pos + x0, (int)y_pos + y0, color);

    } // end for ang

Of course, lookup tables take up memory, but they're usually worth it. "If you can precompute the result, put it in a lookup table" is my motto. How do you think Doom, Quake, and my favorite, Half-Life, do what they do?

Assembly Language

The last optimization I want to discuss is using assembly language. You may already have a cool algorithm and a good data structure and still want more speed. Hand-written assembly no longer gives the dramatic speedups it did on 8-bit and 16-bit processors, but it can still typically make a piece of code run 2 to 10 times faster, which makes it well worth doing. Just make sure you only convert the parts of the game that actually need it; there is no point optimizing the main menu, that's just a waste of time. Profile your game with a performance analysis tool to find out where it's really spending its time (probably in the graphics code), and then target those sections with assembly. I recommend Intel's VTune as a profiling tool.

In the past (several years ago), most compilers didn't support inline assembly, and the ones that did were very hard to use. Today the Microsoft, Borland, and Watcom compilers all provide good inline assembler support, suitable for anything from a few lines to a few hundred, as well as support for separate, standalone assemblers. So if you need assembly, I recommend the inline assembler. Here is how an inline assembly block looks in Microsoft Visual C++:

_asm
   {
   // assembly language code goes here
   } // end asm

The most prominent advantage of the inline assembler is that it lets you refer to variables defined in your C/C++ code by name. The following function uses inline assembly and the string instructions to write a 32-bit memory fill:

void Qmemset(void *memory, int value, int num_quads)
{
// this function uses 32-bit assembly language and the
// string instructions to fill a region of memory

_asm
   {
   CLD                  // clear the direction flag
   MOV EDI, memory      // move pointer into EDI
   MOV ECX, num_quads   // ECX holds the loop count
   MOV EAX, value       // EAX holds the fill value
   REP STOSD            // perform the fill
   } // end asm

} // end Qmemset

To use this new function, you just call it like this:

Qmemset(&buffer, 25, 1000);

This fills 1,000 quads (4-byte values), one after another, with the value 25, starting at the address of buffer.

NOTE: If you are not using Microsoft Visual C++, check the documentation for your compiler to find the exact syntax it requires for inline assembly. In most cases the differences are small.
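If you want a portable fallback alongside the assembly version (for compilers or targets without inline x86 assembly), a plain C version of the same fill is easy to write. This is just a sketch under that assumption; the name Qmemset_c is my own and is not part of the book's code:

void Qmemset_c(void *memory, int value, int num_quads)
{
// portable fallback: fill num_quads 32-bit words with value
unsigned int *ptr = (unsigned int *)memory;

for (int index = 0; index < num_quads; index++)
    ptr[index] = (unsigned int)value;

} // end Qmemset_c

A good optimizing compiler will often turn a loop like this into REP STOSD or wide stores on its own, so it is worth measuring both versions before committing to the assembly one.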

Making a Demo Mode

Once you have finished writing your game, you will probably want a demo mode. There are two main ways to build one: play the game yourself and record your input, or let an artificial-intelligence player play the game. Recording your own play is the most common choice, because writing an AI player that plays like a live human is very hard, and making it play in a way cool enough to impress potential buyers is harder still. Let's look at how both methods work.

Prerecorded demos

To record a demo, you basically record the input state on every cycle of the game loop, write that data to a file, and then later feed the file back to the game engine as if it were live input. Take a look at parts A and B of Figure 11-13. The idea is that the game itself does not know whether its input comes from the keyboard (or another input device) or from a file, so the demo is simply the game playing itself back.

Figure 11-13: Demo playback

For this to work, your game must be deterministic: if you play the game again with exactly the same input, the game characters must do exactly the same things. That means that, in addition to recording the input devices, you must record the initial random-number seed and the starting state of the game. Doing so guarantees that the demo plays back exactly the state you recorded. The best way to record a game is not to sample at fixed time intervals but to sample the input once per frame; that way the demo stays synchronized with the game even when it is played back on a slower computer. My usual approach is to pack all the input devices into a single record per frame and write those records to a file, with the demo's initial state information and random seed placed at the beginning of the file so it is easy to load. The playback file therefore looks like this:

Initial state information

Frame 1: input values
Frame 2: input values
Frame 3: input values
...
Frame n: input values


Once you have this file, you can play the game back: the engine reads the file and treats the data as if it came from the input devices. The game itself never knows the difference!

WARNING: The single worst mistake you can make is sampling the input at the wrong time when you write out the records. Make absolutely certain that the input you sample and record is the actual input the game uses for that frame. A common newbie mistake is to sample the input for the demo at a point in the event loop before or after the point where the game normally reads its input; the recorded data is then different from what the game saw. The player might have the fire button down in one part of the event loop and not in another, so you must sample at exactly the point where you read the input for the game itself.

Demos driven by an AI player

The second way to make a demo is to let an artificial-intelligence "bot" play the game, just as bots play Quake. When the game is in demo mode, the bot plays exactly like any AI character that is part of the game. The only problem with this method (apart from the technical work) is that the bot may not show off all the "cool" rooms, weapons, and so on, because it does not know it is putting on a show. On the other hand, the big benefit of a bot-driven demo is that every run is different, and that variety is valuable when showing the game off, because viewers never get bored. You build the bot the same way you build any other AI character in the game. Basically, you connect it to the game's input interface and override the standard input stream, as shown in part C of Figure 11-13, then give the bot a few high-level goals, such as finding its way through the maze, shooting whatever it sees, and so on. It simply keeps playing until a real player takes over.
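As a concrete illustration of the prerecorded approach, here is a minimal sketch of the per-frame record and file layout described above. The structure and function names (demo_header, demo_frame, Record_Frame, Playback_Frame) are my own inventions, not the book's actual demo code:

#include <stdio.h>

// one record per frame of input
typedef struct demo_frame_typ
    {
    unsigned char keys[256];   // keyboard state for this frame
    int joy_x, joy_y;          // joystick axes
    int buttons;               // button bits
    } demo_frame, *demo_frame_ptr;

typedef struct demo_header_typ
    {
    unsigned int random_seed;  // seed used to start the game
    int start_level;           // initial game state
    int num_frames;            // how many frame records follow
    } demo_header, *demo_header_ptr;

// called once per frame, at the exact point where the game reads its input
void Record_Frame(FILE *fp, demo_frame_ptr input)
    {
    fwrite(input, sizeof(demo_frame), 1, fp);
    } // end Record_Frame

// called once per frame during playback, instead of reading the devices
int Playback_Frame(FILE *fp, demo_frame_ptr input)
    {
    return (fread(input, sizeof(demo_frame), 1, fp) == 1);
    } // end Playback_Frame

At startup you would write (or read) the header first, reseed the random-number generator with random_seed, and then switch the engine's input source to Playback_Frame while the demo runs.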

Saving the Game

Writing the save-game code is one of the most dreaded jobs in game programming, and it is usually one of the last things a programmer gets around to. The key is to think about saving while you are writing the game: design it from the start so the player can save progress at any time. Saving the game at an arbitrary moment means recording every variable and every object in the game, so the save file must contain all of your global variables plus the state of every object. The best way to implement this is object-oriented: instead of writing one giant function that records the state of every object and every global, make each object know how to write its own state to a disk file and read it back. Then, to save the game, all you have to do is write out the globals and call a simple function that asks each object in the game to write itself. When you need to load a saved game, you read the globals back into the system and then let every object read itself back in. With this approach, if you add an object or an object type, the load/save changes are confined to that object and do not ripple through the whole program (a minimal sketch of this scheme appears after the next paragraph).

Implementing Multiplayer Support

The next programming trick is multiplayer play. If you want to write a networked game, that is another matter entirely (although DirectPlay makes the communication part easier). But if you just want two or more players to share your game, either taking turns or playing at the same time, all you need are a few extra data structures and some small adjustments to the program.
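Going back to the save-game scheme just described, here is a minimal C++ sketch of it. The class and function names are illustrative only, not code from the book:

#include <stdio.h>

// every game object knows how to stream its own state
class Game_Object
    {
public:
    virtual ~Game_Object() { }
    virtual void Save(FILE *fp) = 0;   // write this object's state to disk
    virtual void Load(FILE *fp) = 0;   // read it back
    };

class Player_Object : public Game_Object
    {
public:
    int x, y;      // position
    int health;    // hit points

    virtual void Save(FILE *fp)
        {
        fwrite(&x, sizeof(x), 1, fp);
        fwrite(&y, sizeof(y), 1, fp);
        fwrite(&health, sizeof(health), 1, fp);
        } // end Save

    virtual void Load(FILE *fp)
        {
        fread(&x, sizeof(x), 1, fp);
        fread(&y, sizeof(y), 1, fp);
        fread(&health, sizeof(health), 1, fp);
        } // end Load
    };

// saving the whole game is then the globals plus one loop over the objects
void Save_Game(FILE *fp, Game_Object **objects, int num_objects)
    {
    // ...write the global variables here...

    for (int index = 0; index < num_objects; index++)
        objects[index]->Save(fp);

    } // end Save_Game

Loading is the mirror image: read the globals, then call Load() on each object in the same order.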

Taking turns

Turn-based play is both simple and tricky. It is simple because, once you support one player, supporting two or more just means keeping more than one player record. It is tricky because you must save each player's game when the turn switches, so if your game needs a turn-taking option you must also implement save-game support. Obviously, the players should never notice that a save happens during the switch. With that in mind, here are the steps for a two-player, turn-based game (see the sketch after this list):

1. Start the game; player 1 goes first.
2. Player 1 plays until his turn ends.
3. Player 1's game state is saved, and player 2 starts.
4. Player 2 plays until his turn ends.
5. Player 2's state is saved (this is the turn switch).
6. Player 1's previously saved game is reloaded, and player 1 continues.
7. Go back to step 2.

As you can see, the switch happens at step 5, after which the game alternates between the two players. If there are more than two players, simply cycle through them in order (only one person plays at a time) until the last one finishes, and then start again from the first.
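A minimal sketch of that hot-seat loop might look like the following. Game_Over, Load_Player_State, Play_Until_Turn_Ends, and Save_Player_State are placeholders for whatever the game actually uses; they are not functions from the book:

#define NUM_PLAYERS 2

// hypothetical helpers, defined elsewhere in the game
int  Game_Over(void);
void Load_Player_State(int player);
void Play_Until_Turn_Ends(int player);
void Save_Player_State(int player);

void Hot_Seat_Loop(void)
    {
    int current = 0;   // index of the player whose turn it is

    while (!Game_Over())
        {
        Load_Player_State(current);       // restore this player's saved game
        Play_Until_Turn_Ends(current);    // run the normal game loop for one turn
        Save_Player_State(current);       // save before switching

        current = (current + 1) % NUM_PLAYERS;   // rotate to the next player
        } // end while

    } // end Hot_Seat_Loop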

Split-screen play

Letting two or more players play at the same time on the same screen is more involved than turn-based play, because the game logic itself becomes more complex: you have to think about the rules, collisions, and interactions between the players. You also have to assign each player a specific input device, which usually means one joystick per player, or one player on the keyboard and the other on a joystick. Simultaneous play also raises a design problem, because some games simply do not suit it. In a scrolling game, for example, one player may want to go one way while the other wants to go another, and you have to resolve that conflict somehow. The games best suited to simultaneous play are those where the players stay together and move toward the same goal. But if you want to let the players move freely, you can build the split-screen display shown in Figure 11-14.

Figure 11-14: A split-screen display

The only real problem with a split-screen display is the split-screen display itself! You have to render two or more views of the game, which is technically challenging, and the screen may not have enough room to show two or more views clearly, so the players may have trouble seeing what is going on. But if you can pull it off, it is at least a very cool option...
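One way to structure a split-screen renderer is to give each player a viewport rectangle and a camera, and render the same world once per viewport. This is only a sketch; Render_World and the camera structure are invented names, not APIs from the book or from DirectX:

#include <windows.h>   // for RECT and SetRect()

#define NUM_VIEWS 2

typedef struct camera_typ
    {
    int world_x, world_y;   // where this player's view is centered
    } camera, *camera_ptr;

// hypothetical renderer, defined elsewhere: clips to view and draws the world
void Render_World(RECT *view, camera_ptr cam);

RECT   viewports[NUM_VIEWS];   // screen regions, e.g. top and bottom halves
camera cameras[NUM_VIEWS];     // one camera per player

void Render_Frame(int screen_width, int screen_height)
    {
    // split the screen horizontally: player 0 on top, player 1 on the bottom
    SetRect(&viewports[0], 0, 0, screen_width, screen_height / 2);
    SetRect(&viewports[1], 0, screen_height / 2, screen_width, screen_height);

    for (int view = 0; view < NUM_VIEWS; view++)
        Render_World(&viewports[view], &cameras[view]);

    } // end Render_Frame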

Multithreaded Programming Techniques

So far, every demo program in this book has used a single-threaded event loop and programming model. The event loop responds to the player's input and renders the game screen at around 30 frames per second; while it reacts to the player, the game also performs millions of calculations per second and handles dozens or hundreds of small jobs such as drawing every object, gathering input data, and playing music. Figure 11-15 shows the standard game loop.

Figure 11-15: The standard DOS-style single-tasking game loop

As Figure 11-15 shows, the game logic performs its various tasks serially (in order). There are exceptions, of course; simple jobs such as music and input handling are sometimes done from interrupts. But in general the game is one long sequence of function calls. Even though every task executes in order, the computer is fast enough that the results appear to happen simultaneously, so the game looks smooth and believable. Most game programs are therefore a single thread of execution that performs its operations in order and produces the output needed for each frame. That is a perfectly good way to solve the problem, and it was the inevitable result of DOS game programming. Today, however, the DOS era is behind us, so it is time to unleash the multithreading power of Windows 95/98/Me/NT/2000, and I'll bet you will like it! This section discusses threads of execution under Windows 95/98/NT; by using threads you can easily run several tasks at once within a single application. Before we start, let's get some terminology straight so the terms don't seem abrupt when they come up.

Multithreaded programming terminology

The computer dictionary is full of words that begin with "multi-". Let's cover multiprocessor and multiprocessing first, and then get to multithreading.

A multiprocessor computer is a computer with more than one processor. Cray machines and the Connection Machine are examples; a Connection Machine could hold up to 64K processors (arranged as a hypercube network), each executing code. For the ordinary consumer, a four-processor Pentium III box running Windows NT is a more practical purchase. These are usually SMP (symmetric multiprocessing) systems, meaning the four processors share the work symmetrically. In practice it is not perfectly symmetric, because the operating system kernel runs on only one processor, but as the number of processes grows the remaining work is spread evenly across the processors. The idea of a multiprocessor computer, then, is to share the workload among several processors.

On some systems each processor can execute only one task or process, while on others, such as Windows NT, each processor can run thousands of tasks. That, basically, is multiprocessing: multiple tasks running on a computer with one (or more) processors.

The newest concept is multithreading, and it is the term everyone is most interested in today. Under Windows 95/98/NT/2000 a process is a complete program; it usually corresponds to an application (although some processes cannot run on their own). It has its own memory space and context and runs independently. A thread is a much lighter-weight entity than a process.
Threads are created by a process, are simpler in structure, and run inside the address space of the process that created them. The beautiful thing about threads is that they get their own slices of processor time while living inside the parent process's address space.

That means communicating with a thread is very easy. In essence, a thread is exactly what a game programmer wants: a thread of execution that works in parallel with the main program's tasks and can access the program's variables.

While we are sorting out the "multi-" prefixes, a few more concepts are worth nailing down. First, Windows 95, 98, NT, and 2000 are preemptive multitasking operating systems. That means no task, process, or thread ever has complete control of the computer; each one can be interrupted (preempted) at some point so the next thread of execution can run. This is completely different from Windows 3.1, which was not preemptive: if you did not call GetMessage() in every pass through your loop, no other process got to do any work. Under Windows 95/98/NT/2000 you can sit in an infinite for loop and the operating system still runs every other task normally.

Under Windows 95/98/NT/2000 every process and thread has a priority, which determines how long it gets to run before it is preempted. If there are ten processes with identical priority, they are scheduled round-robin, each getting the same amount of time. But if one thread has, say, a kernel-level priority, that thread gets more running time in each cycle, as shown in Figure 11-16.

Figure 11-16: Round-robin scheduling of threads with equal and unequal priorities
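If you want to experiment with priorities yourself, the Win32 API exposes them through SetThreadPriority(); the small sketch below simply bumps a worker thread one level above normal. Whether that actually helps a particular game is something you would have to measure:

#include <windows.h>

// give an already created worker thread a slightly higher priority
void Boost_Thread(HANDLE thread_handle)
    {
    // THREAD_PRIORITY_ABOVE_NORMAL is one step above the default;
    // THREAD_PRIORITY_TIME_CRITICAL also exists but can starve the rest of the game
    SetThreadPriority(thread_handle, THREAD_PRIORITY_ABOVE_NORMAL);
    } // end Boost_Thread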

Finally, the obvious question: are there differences in multithreading among Windows 95, 98, NT, and 2000? There are some, but a program written against the Windows 95 threading model runs safely on all the other platforms, since it is the lowest common denominator. Although 98 and NT are more stable, I still use a Windows 95 machine to run most of the example programs.

Why would you want threads in a game? By now the answer should be obvious; I suspect you could list a thousand uses off the top of your head. But in case you can't (maybe you just woke up and haven't had your Mountain Dew, or Sobe, which I'm currently in love with), here are a few places multithreaded programming comes in handy:

• Updating animation
• Generating surround sound effects
• Controlling small objects
• Polling input devices
• Updating global data structures
• Creating pop-up menus and controls

I use the last item a lot. Bringing up a menu while the game is running so the player can change settings has always been a headache, but it is much simpler with a thread.

I still have not answered why you would use threads in a game instead of one huge loop and a chain of function calls. It is true that everything a thread does could also be done that way, but when you write object-oriented games you want structures that behave like automata: objects representing game characters that you can create and destroy without any logical side effects on the main game loop. You can get exactly that with C++ classes combined with multithreaded programming.

Before you write your first multithreaded program, be clear about one fact: on a single-processor machine, only one thread actually runs at any instant. There is no free lunch; threads are a methodology for organizing software, so use multithreading for simplicity and correctness, not for imaginary speedups. Figure 11-17 shows a primary process and three threads "executing at the same time."

Figure 11-17: A primary thread spawning three threads; the time each thread occupies is measured in milliseconds

As you can see, only one thread runs at a time, but they take turns, and higher-priority threads get more of the time. Enough theater; let's see some code!

Getting a thread up and running

The examples that follow are console-mode programs, so please make sure you compile them as such. (I mention this because I get 30 to 60 emails an hour from readers of my books who are fighting their VC++ compiler settings. Did I put that in the preface?) One more warning: for these examples you must link against the multithreaded run-time library. From the main menu of MS Dev Studio select Project, Settings, then on the C/C++ tab under the Code Generation category set Use Run-Time Library to Multithreaded, as shown in Figure 11-18. Also make sure optimization is turned off; it can sometimes interfere with multithreaded synchronization code, so it is safest to disable it for these demos.

Figure 11-18: Creating a console application that uses the multithreaded library

NOTE: I have the strangest feeling of déjà vu. Have we met before? Or is it just a random illusion? If you don't feel it too, never mind. :)

Everything is ready, so let's get started. Creating a thread is the easy part; keeping it from blowing up is the hard part! The Win32 API call looks like this:

HANDLE CreateThread(
    LPSECURITY_ATTRIBUTES lpThreadAttributes, // pointer to thread security attributes
    DWORD dwStackSize,                        // initial thread stack size, in bytes
    LPTHREAD_START_ROUTINE lpStartAddress,    // pointer to thread function
    LPVOID lpParameter,                       // argument for new thread
    DWORD dwCreationFlags,                    // creation flags
    LPDWORD lpThreadId);                      // pointer to returned thread identifier

lpThreadAttributes points to a SECURITY_ATTRIBUTES structure that specifies the security attributes for the thread. If lpThreadAttributes is NULL, the thread is created with the default security descriptor and the returned handle is not inherited.

dwStackSize specifies the size of the thread's stack in bytes. If the value is 0, the stack is the same size as the process's primary thread stack. The stack is allocated automatically in the process's memory space, is freed when the thread terminates, and grows as needed. CreateThread tries to commit dwStackSize bytes and fails if not enough memory is available.

lpStartAddress points to the application-supplied function the thread will execute; it is also the thread's starting address. The function takes a single 32-bit parameter and returns a 32-bit value.

lpParameter is a 32-bit value passed to the thread function.

dwCreationFlags controls how the thread starts; 0 means start running immediately.

lpThreadId points to a 32-bit variable that receives the thread identifier.

If the function succeeds, the return value is a handle to the new thread; if it fails, the return value is NULL. Call GetLastError() for extended error information. The call may look a bit complex, but it really isn't; it just allows a lot of control, most of which you will never use.

When you are done with a thread, you should close its handle; in other words, you let the system know you are no longer using the object. You do this by calling CloseHandle() with the handle returned by CreateThread(), which decrements the reference count of the kernel object. Do this for every thread you create. Closing the handle does not force the thread to end; it just tells the system you are finished with the handle. The thread itself either returns on its own, is terminated explicitly (with TerminateThread()), or is killed by the operating system when the primary thread exits. We will get to each of these later; for now, just remember that this call has to be made before a multithreaded program exits. Here is the prototype:

BOOL CloseHandle(HANDLE hObject);  // handle to object to close

hObject is any open object handle.

If the call succeeds, the return value is TRUE; if it fails, it is FALSE, and GetLastError() gives detailed error information. CloseHandle() can also be used to close handles to the following kinds of objects:

• Console input or output
• Event
• File
• File mapping
• Mutex
• Named pipe
• Process
• Semaphore
• Thread

Basically, CloseHandle() invalidates the specified object handle, decrements the object's handle reference count, and performs the object-retention check. Once the last handle to an object is closed, the object itself is removed from the operating system.

WARNING: The handle of a new thread has full access rights to that thread when it is created. If no security descriptor is provided, the handle can be used by any function that takes a thread handle. When a security descriptor is provided, an access check is performed on all subsequent uses of the handle, and if the check fails, the requesting process cannot use the handle to access the thread.

Now look at what a thread function that can be passed to CreateThread() looks like:

DWORD WINAPI My_Thread(LPVOID data)
{
// ... do work

// return an exit code, whatever is appropriate for your app

return(26);

} // end My_Thread

Now you have everything you need to create your first multithreaded application. The first example creates a single secondary thread: the secondary thread prints the number 1 while the primary thread (the main program) prints the number 2. DEMO11_5.CPP contains the entire program, shown here:

// DEMO11_5.CPP - creates a single thread that prints
// simultaneously while the primary thread prints.

// INCLUDES ///////////////////////////////////////////

#define WIN32_LEAN_AND_MEAN  // make sure certain headers are included correctly

#include <windows.h>   // include the standard windows stuff
#include <stdio.h>     // include the standard I/O stuff
#include <conio.h>

// DEFINES ////////////////////////////////////////////

// PROTOTYPES /////////////////////////////////////////

DWORD WINAPI Printer_Thread(LPVOID data);

// GLOBALS ////////////////////////////////////////////

// FUNCTIONS //////////////////////////////////////////

DWORD WINAPI Printer_Thread(LPVOID data)
{
// this thread function simply prints out the data
// 25 times with a slight delay

for (int index = 0; index < 25; index++)
    {
    printf("%d ", (int)data);  // output a single character
    Sleep(100);                // sleep a little to slow things down
    } // end for index

// just return the data sent to the thread function
return((DWORD)data);

} // end Printer_Thread

// main /////

void main(void)
{
HANDLE thread_handle;   // this is the handle to the thread
DWORD  thread_id;       // this is the id of the thread

// start with a blank line
printf("\nStarting threads...\n");

// create the thread; in real life we would check for errors
thread_handle = CreateThread(NULL,            // default security
                             0,               // default stack
                             Printer_Thread,  // use this thread function
                             (LPVOID)1,       // user data sent to thread
                             0,               // creation flags, 0 = start now
                             &thread_id);     // send id back in this var

// now enter the printing loop; make sure this takes longer
// than the thread, so the thread finishes first
for (int index = 0; index < 50; index++)
    {
    printf("2 ");
    Sleep(100);
    } // end for index

// at this point the thread should be dead
CloseHandle(thread_handle);

// end with a blank line
printf("\nAll threads terminated.\n");

} // end main

Here is an example of the output:

Starting threads...
2 1 2 1 2 1 2 1 2 2 1 2 2 1 2 2 2 2 2 1 1 2 2 1 1 2 2 1 1 2 2 1 1 2 2 1 1 2 2 1
1 2 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
All threads terminated.

As you can see from the output, each thread runs only for a short period, and then the system switches to the other thread that is waiting to run. In this case the operating system simply alternates between the primary thread and the secondary thread.

Now let's try creating more than one thread. All you have to do to DEMO11_5.CPP is call CreateThread() several times; each call creates another thread. Each thread is handed a different value to print, so you can tell the threads apart. DEMO11_6.CPP|EXE contains the modified multithreaded version, listed here for reference. Notice that I use arrays to hold the thread handles and IDs.

// DEMO11_6.CPP - a new version that creates 3
// secondary threads of execution

// INCLUDES ///////////////////////////////////////////

#define WIN32_LEAN_AND_MEAN  // make sure certain headers are included correctly

#include <windows.h>   // include the standard windows stuff
#include <stdio.h>     // include the standard I/O stuff
#include <conio.h>

// DEFINES ////////////////////////////////////////////

#define MAX_THREADS 3

// PROTOTYPES /////////////////////////////////////////

DWORD WINAPI Printer_Thread(LPVOID data);

// GLOBALS ////////////////////////////////////////////

// FUNCTIONS //////////////////////////////////////////

DWORD WINAPI Printer_Thread(LPVOID data)
{
// this thread function simply prints out the data
// 25 times with a slight delay

for (int index = 0; index < 25; index++)
    {
    printf("%d ", (int)data + 1);  // output a single character
    Sleep(100);                    // sleep a little to slow things down
    } // end for index

// just return the data sent to the thread function
return((DWORD)data);

} // end Printer_Thread

// main /////

void main(void)
{
HANDLE thread_handle[MAX_THREADS];  // this holds the handles to the threads
DWORD  thread_id[MAX_THREADS];      // this holds the ids of the threads

// start with a blank line
printf("\nStarting all threads...\n");

// create the threads; in real life we would check for errors
for (int index = 0; index < MAX_THREADS; index++)
    {
    thread_handle[index] = CreateThread(NULL,            // default security
                                        0,               // default stack
                                        Printer_Thread,  // use this thread function
                                        (LPVOID)index,   // user data sent to thread
                                        0,               // creation flags, 0 = start now
                                        &thread_id[index]); // send id back in this var
    } // end for index

// now enter the printing loop; make sure this takes longer than the threads,
// so the threads finish first; note that the primary thread prints 4
for (index = 0; index < 75; index++)
    {
    printf("4 ");
    Sleep(100);
    } // end for index

// at this point the threads should all be dead, so close the handles
for (index = 0; index < MAX_THREADS; index++)
    CloseHandle(thread_handle[index]);

// end with a blank line
printf("\nAll threads terminated.\n");

} // end main

Here is an example of the output:

Starting all threads...
4 1 2 3 4 1 2 3 4 1 2 3 1 4 2 3 4 1 2 3 1 4 2 3 4 1 2 3 1 4 2 3 4 1 2 3 1 4 2 3
4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3
4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
All threads terminated.

Wow! Isn't that cool? Creating multiple threads really is that easy. If you are quick, you may already be asking: why can every thread call back into the same function? The reason this works is that the function's local variables are created on the stack, and every thread gets its own stack, so each running instance of the function works independently, as shown in Figure 11-19.

Figure 11-19: Memory and program-space layout of the primary thread and the secondary threads

Figure 11-19 illustrates a very important point: termination. All of the secondary threads terminate on their own, and the primary thread has no control over that. Worse, the primary thread has no way of knowing whether the other threads are still running or have terminated (other than the fact that they eventually return). We need a way for threads to communicate and a way to check the status of each thread. The function TerminateThread() ends a thread by brute force, and in general I do not recommend it.

Message passing between threads

Let's see how the primary thread can control the secondary threads it creates. For example, the primary thread might need to tell all of the secondary threads to shut down. How is that done? There are two ways to end a thread:

• Send the thread a message asking it to terminate (the right way).
• Force the thread to terminate with a kernel-level call (the wrong way).

Although the wrong way sometimes has to be used, it is not safe, because it simply rips the thread out from under the system: the thread never gets a chance to clean up after itself, which can leak memory and resources. So use it with caution. Figure 11-20 illustrates both ways of ending a thread.

Figure 11-20: Thread termination methods

First let's look at TerminateThread(); then we will look at an example that sends a termination message to the threads and lets them shut themselves down.

BOOL TerminateThread(HANDLE hThread,     // handle to the thread
                     DWORD dwExitCode);  // exit code for the thread

hThread identifies the thread to terminate; the handle must have THREAD_TERMINATE access. dwExitCode specifies the exit code for the thread; use GetExitCodeThread() to retrieve a thread's exit value. If the function succeeds, the return value is TRUE; otherwise it is FALSE, and GetLastError() gives detailed error information.

TerminateThread() makes a thread exit. When it is called, the target thread stops executing user code immediately, and its initial stack is not freed. Dynamic-link libraries attached to the thread are not notified that the thread is terminating, which is one of the things that makes this call dangerous. :) Using TerminateThread() is simple: call it with the handle of the thread to kill and the exit code you want reported, and the thread is gone. But don't misunderstand me; if the function were useless it would not exist. If you do use it, just make sure you have thought through the consequences.
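If you just want to ask whether a thread is still running without killing it, GetExitCodeThread() will tell you. Here is a minimal sketch; the helper name Thread_Is_Done is my own, but GetExitCodeThread() and the STILL_ACTIVE constant are standard Win32:

#include <windows.h>

// returns 1 if the thread identified by thread_handle has finished, 0 otherwise
int Thread_Is_Done(HANDLE thread_handle)
    {
    DWORD exit_code = 0;

    if (GetExitCodeThread(thread_handle, &exit_code))
        return (exit_code != STILL_ACTIVE);

    return(0);  // couldn't query the thread; assume it is still running
    } // end Thread_Is_Done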

A better, cooperative approach passes a termination message through a global variable that each thread watches. When a secondary thread sees the global termination flag go up, it shuts itself down. But then how does the primary thread know that all of the secondary threads have finished? One way is to keep another global variable that each thread decrements as it exits, in effect a reference count. The primary thread monitors that counter; when it reaches 0, all the secondary threads have ended, and the primary thread can safely continue and close the handles. A complete message-passing example follows, and after that we will not be far from programming something real. DEMO11_7.CPP|EXE demonstrates global message passing, as shown here:

// DEMO11_7.CPP - an example of global message passing to control
// the termination of threads.

// INCLUDES ///////////////////////////////////////////

#define WIN32_LEAN_AND_MEAN  // make sure certain headers are included correctly

#include <windows.h>   // include the standard windows stuff
#include <stdio.h>     // include the standard I/O stuff
#include <conio.h>     // for kbhit() and getch()

// DEFINES ////////////////////////////////////////////

#define MAX_THREADS 3

// PROTOTYPES /////////////////////////////////////////

DWORD WINAPI Printer_Thread(LPVOID data);

// GLOBALS ////////////////////////////////////////////

int terminate_threads = 0;  // global message flag to terminate
int active_threads    = 0;  // number of active threads

// FUNCTIONS //////////////////////////////////////////

DWORD WINAPI Printer_Thread(LPVOID data)
{
// this thread function simply prints out the data until it is told to terminate

for (;;)
    {
    printf("%d ", (int)data + 1);  // output a single character
    Sleep(100);                    // sleep a little to slow things down

    // test for the termination message
    if (terminate_threads)
        break;

    } // end for

// decrement the number of active threads
if (active_threads > 0)
    active_threads--;

// just return the data sent to the thread function
return((DWORD)data);

} // end Printer_Thread

// main //

void main(void)
{
HANDLE thread_handle[MAX_THREADS];  // this holds the handles to the threads
DWORD  thread_id[MAX_THREADS];      // this holds the ids of the threads

// start with a blank line
printf("\nStarting threads...\n");

// create the threads; in real life we would check for errors
for (int index = 0; index < MAX_THREADS; index++)
    {
    thread_handle[index] = CreateThread(NULL,            // default security
                                        0,               // default stack
                                        Printer_Thread,  // use this thread function
                                        (LPVOID)index,   // user data sent to thread
                                        0,               // creation flags, 0 = start now
                                        &thread_id[index]); // send id back in this var

    // increment the number of active threads
    active_threads++;

    } // end for index

// now enter the printing loop; make sure this takes longer than the threads,
// so the threads finish first; note that the primary thread prints 4
for (index = 0; index < 25; index++)
    {
    printf("4 ");
    Sleep(100);
    } // end for index

// at this point all the threads are still running; now if the keyboard is hit,
// a message will be sent to terminate all the threads, and this thread
// will wait for all of the threads to check in

while (!kbhit());

// get that char
getch();

// set the global termination flag
terminate_threads = 1;

// wait for all the threads to terminate;
// when all have terminated, active_threads == 0
while (active_threads);

// at this point the threads should all be dead, so close the handles
for (index = 0; index < MAX_THREADS; index++)
    CloseHandle(thread_handle[index]);

// end with a blank line
printf("\nAll threads terminated.\n");

} // end main

Here is an example of the output:

Starting threads...
4 1 2 3 4 2 1 3 4 3 1 2 4 2 1 3 4 3 1 4 2 1 3 4 2 3 1 4 2 1 3 4 2 3 1 4 2 3 1 4
2 3 1 4 2 3 1 4 2 3 1 4 2 3 1 4 2 3 1 4 2 3 1 4 2 3 1 4 2 3 1 4 2 3 1 4 2 3 1 4
2 3 1 2 3 1 3 2 1 2 3 3 2 1 2 3 3 2 1 2 3 3 2 1 1 2 3 3 2 1 2 3 1 3 2 1 2 3 1 3
2 3 1 3 1 3 2 3 2 1 3 1 2 3 2 1
All threads terminated.

When the user presses a key, all of the secondary threads shut down, and then the primary thread ends as well. There are two problems with this approach. The first one is subtle; here is the scenario, and if you walk through it a few times you will see the problem:

1. Suppose only one secondary thread has not yet shut down.
2. Suppose that last thread has the processor and has just decremented the global variable that tracks the number of active threads.
3. At exactly that moment, the scheduler switches to the primary thread. The primary thread sees the counter at 0 and assumes every thread has ended, but the last thread has not actually returned yet!

In most cases this is harmless, but if there is code between the decrement and the return, the problem becomes real. What we really need is a function that asks the operating system whether a thread has finished; that turns out to be useful in many situations, and the Wait*() family of functions provides exactly that.

The second problem is the busy loop, or polling loop. Under Win16 and DOS a spin loop was a reasonable way to wait, but under Win32 it is terrible: sitting in a tight loop waiting on a variable puts a heavy load on the multitasking kernel and hogs the CPU. You can see this for yourself with SYSMON.EXE (included with Windows 95/98/Me), PERFMON.EXE (Windows NT/2000), or a similar third-party CPU-usage tool; these utilities help you see what your threads are doing and how much of the processor they are consuming. Next, let's see how the Wait*() functions help us determine whether a thread has ended.
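Before moving on, one note on problem 1: the race on the shared counter can be narrowed by using the Win32 interlocked operations instead of a bare increment and decrement. A small sketch, separate from the Wait*() discussion that follows; note that the counter must be a LONG for these calls:

#include <windows.h>

volatile LONG active_threads_safe = 0;   // shared counter, updated atomically

void Worker_Started(void)
    {
    // atomic increment; safe even if the scheduler preempts us mid-update
    InterlockedIncrement((LONG *)&active_threads_safe);
    } // end Worker_Started

void Worker_Finished(void)
    {
    // atomic decrement, done as the very last thing before the thread returns
    InterlockedDecrement((LONG *)&active_threads_safe);
    } // end Worker_Finished

Atomic updates fix torn writes to the counter, but they still do not tell the primary thread that the worker has actually returned; for that, the Wait*() functions described next are the right tool.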

Waiting for the right moment

Get ready for the most confusing explanation you have ever heard... and it's not my fault, really! Whenever a thread terminates, it becomes signaled to the kernel; while it is running, it is unsignaled. Whatever that means. And what is the price of plastic zippers tomorrow? You don't care! What you do care about is how to test for that signal. The Wait*() family of functions lets you monitor these events, for either a single signal or several at once, and calling one of them makes your thread wait until the signal occurs. That avoids a busy loop and is, in most cases, far better than polling a global variable. Figure 11-21 illustrates how the Wait*() functions work and the relationship between your running program and the OS kernel.

Figure 11-21: The timing relationships involved in using the Wait*() functions

The two functions you need are WaitForSingleObject() and WaitForMultipleObjects(), used for single-signal and multiple-signal detection, respectively. They are defined as follows:

DWORD WaitForSingleObject(HANDLE hHandle,        // handle of object to wait for
                          DWORD dwMilliseconds); // time-out interval in milliseconds

hHandle identifies the object to wait on. dwMilliseconds specifies the time-out interval in milliseconds; the function returns when the interval elapses even if the object is still unsignaled. If dwMilliseconds is 0, the function tests the object's state and returns immediately; if it is INFINITE, the function never times out. If the function succeeds, the return value indicates the event that caused it to return; if it fails, the return value is WAIT_FAILED, and GetLastError() gives detailed error information. The success return values are:

• WAIT_ABANDONED - The specified object is a mutex that was not released by the thread that owned it before that thread terminated. Ownership of the mutex is granted to the calling thread, and the mutex is set to unsignaled.
• WAIT_OBJECT_0 - The specified object is signaled.
• WAIT_TIMEOUT - The time-out interval elapsed and the object is still unsignaled.

In general, WaitForSingleObject() checks the current state of the specified object. If the object is unsignaled, the calling thread enters a very efficient wait state in which it consumes almost no CPU time until one of the wait conditions is satisfied.

Here is the multiple-signal version, which is mostly used to wait for several threads to terminate:

DWORD WaitForMultipleObjects(DWORD nCount,            // number of handles in the handle array
                             CONST HANDLE *lpHandles, // address of object-handle array
                             BOOL bWaitAll,           // wait flag
                             DWORD dwMilliseconds);   // time-out interval in milliseconds

nCount specifies the number of object handles in the array pointed to by lpHandles; the maximum is MAXIMUM_WAIT_OBJECTS. lpHandles points to the array of object handles, which may contain handles to objects of different types. Note that under Windows NT the handles must have SYNCHRONIZE access. bWaitAll specifies the wait type: if it is TRUE, the function returns when all of the objects in the lpHandles array are signaled at the same time; if it is FALSE, it returns when any one of them is signaled, and the return value tells you which object caused the return. dwMilliseconds specifies the time-out in milliseconds; the function returns when the interval elapses even if the condition specified by bWaitAll has not been met. If dwMilliseconds is 0, the function tests the objects and returns immediately; if it is INFINITE, it never times out.

If the function succeeds, its return value indicates the event that caused it to return; if it fails, the return value is WAIT_FAILED, and GetLastError() gives detailed error information. The main return values are:

• WAIT_OBJECT_0 to (WAIT_OBJECT_0 + nCount - 1) - If bWaitAll is TRUE, all of the specified objects are signaled. If bWaitAll is FALSE, the return value minus WAIT_OBJECT_0 is the index into the lpHandles array of the object that satisfied the wait; if more than one object became signaled during the call, it is the smallest such index.
• WAIT_ABANDONED_0 to (WAIT_ABANDONED_0 + nCount - 1) - If bWaitAll is TRUE, all of the specified objects are signaled and at least one of them is an abandoned mutex. If bWaitAll is FALSE, the return value minus WAIT_ABANDONED_0 is the index into the lpHandles array of the abandoned mutex that satisfied the wait.
• WAIT_TIMEOUT - The time-out interval elapsed and the condition specified by bWaitAll was not satisfied.

WaitForMultipleObjects() determines whether the wait condition has been met. If it has not, the calling thread enters an efficient wait state, consuming very little in the way of system resources, until the condition is satisfied or the time-out elapses.

Synchronizing threads with signals is fairly technical, so it is worth showing how these functions are used; only small changes to the previous example are needed. In the next version you remove the global termination flag and give the primary thread a simple loop followed by a call to WaitForSingleObject(). Removing the global termination message just keeps the program simple; it is still, admittedly, the best way to tell a thread to quit. But because the primary thread was testing it in a busy loop, it is not the best way to wait for a thread, and that is exactly why the WaitForSingleObject() call is used: it waits in a virtual wait loop that consumes almost no CPU time. Since WaitForSingleObject() can wait on only one signal, it can wait for only one thread to end, so this example uses a single secondary thread. Later we will rewrite the program with three threads and wait for all of them with WaitForMultipleObjects(). DEMO11_8.CPP|EXE uses WaitForSingleObject() to wait for a single secondary thread to end; the code follows:

// DEMO11_8.CPP - a single-threaded example of
// WaitForSingleObject(...)

// INCLUDES ///////////////////////////////////////////

#define WIN32_LEAN_AND_MEAN  // make sure certain headers are included correctly

#include <windows.h>   // include the standard windows stuff
#include <stdio.h>     // include the standard I/O stuff
#include <conio.h>

// DEFINES ////////////////////////////////////////////

// PROTOTYPES /////////////////////////////////////////

DWORD WINAPI Printer_Thread(LPVOID data);

// GLOBALS ////////////////////////////////////////////

// FUNCTIONS //////////////////////////////////////////

DWORD WINAPI Printer_Thread(LPVOID data)
{
// this thread function simply prints out the data 50
// times with a slight delay

for (int index = 0; index < 50; index++)
    {
    printf("%d ", (int)data);  // output a single character
    Sleep(100);                // sleep a little to slow things down
    } // end for index

// just return the data sent to the thread function
return((DWORD)data);

} // end Printer_Thread

// main /////

void main(void)
{
HANDLE thread_handle;   // this is the handle to the thread
DWORD  thread_id;       // this is the id of the thread

// start with a blank line
printf("\nStarting threads...\n");

// create the thread; in real life we would check for errors
thread_handle = CreateThread(NULL,            // default security
                             0,               // default stack
                             Printer_Thread,  // use this thread function
                             (LPVOID)1,       // user data sent to thread
                             0,               // creation flags, 0 = start now
                             &thread_id);     // send id back in this var

// now enter a printing loop that finishes before the secondary thread does
for (int index = 0; index < 25; index++)
    {
    printf("2 ");
    Sleep(100);
    } // end for index

// note that this print statement may get interspliced
// with the output of the thread, very key!
printf("\nWaiting for thread to terminate\n");

// at this point the secondary thread should still be working;
// now we will wait for it
WaitForSingleObject(thread_handle, INFINITE);

// at this point the thread should be dead
CloseHandle(thread_handle);

// end with a blank line
printf("\nAll threads terminated.\n");

} // end main

Here is an example of the output:

Starting threads...
2 1 2 1 2 1 2 2 1 1 2 2 1 2 2 1 2 2 1 1 2 1 1 2 2 1 2 2 2 1 1 2 2 1 1 2 2 1 1 2
2 1 1 2 2 1
Waiting for thread to terminate
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
All threads terminated.

The program is simple. After creating the secondary thread it enters its print loop as usual, and when the loop is done it calls WaitForSingleObject(). If the primary thread had other work to do it could keep going, but in this example it does not, so it goes straight into the wait state. If you start SYSMON.EXE before running the program, you will see that CPU usage drops to almost nothing once the wait state is entered, whereas a busy loop would have pegged the CPU. There is also a nice trick with WaitForSingleObject(): if you just want to know the state of a thread at the moment of the call, pass 0 as the time-out instead of INFINITE, as in the following fragment:

// ...code

DWORD state = WaitForSingleObject(thread_handle, 0);  // get the status

// test the status
if (state == WAIT_OBJECT_0)
    {
    // thread is signaled, i.e., terminated
    } // end if
else
if (state == WAIT_TIMEOUT)
    {
    // thread is still running
    } // end else

// ...code

Simple as that; it is a wonderful way to detect whether a particular thread has ended. Combining this test with a global termination message is a very reliable way to shut threads down, and it also lets you check from inside a real-time loop whether a thread has terminated without ever entering a wait state.

Waiting on multiple objects

We have almost covered everything. The last of the Wait*() functions to look at is the one that waits for multiple objects or threads to become signaled, so let's write a program that uses it. All we have to do is create an array of threads and then pass the array of handles to WaitForMultipleObjects() along with a few parameters; when the function returns, if all went well, every thread will have terminated. DEMO11_9.CPP|EXE is similar to DEMO11_8.CPP|EXE, except that it creates multiple threads and the primary thread then waits for all of the others to terminate. No global termination flag is used here, since you already know how to do that; each secondary thread simply terminates after running for a number of iterations. The source for DEMO11_9.CPP follows:

// DEMO11_9.CPP - an example use of
// WaitForMultipleObjects(...)

// INCLUDES ///////////////////////////////////////////

#define WIN32_LEAN_AND_MEAN  // make sure certain headers are included correctly

#include <windows.h>   // include the standard windows stuff
#include <stdio.h>     // include the standard I/O stuff
#include <conio.h>

// DEFINES ////////////////////////////////////////////

#define MAX_THREADS 3

// PROTOTYPES /////////////////////////////////////////

DWORD WINAPI Printer_Thread(LPVOID data);

// GLOBALS ////////////////////////////////////////////

// FUNCTIONS //////////////////////////////////////////

DWORD WINAPI Printer_Thread(LPVOID data)
{
// this thread function simply prints out the data 50
// times with a slight delay

for (int index = 0; index < 50; index++)
    {
    printf("%d ", (int)data + 1);  // output a single character
    Sleep(100);                    // sleep a little to slow things down
    } // end for index

// just return the data sent to the thread function
return((DWORD)data);

} // end Printer_Thread

// main

void main(void)
{
HANDLE thread_handle[MAX_THREADS];  // this holds the handles to the threads
DWORD  thread_id[MAX_THREADS];      // this holds the ids of the threads

// start with a blank line
printf("\nStarting all threads...\n");

// create the threads; in real life we would check for errors
for (int index = 0; index < MAX_THREADS; index++)
    {
    thread_handle[index] = CreateThread(NULL,            // default security
                                        0,               // default stack
                                        Printer_Thread,  // use this thread function
                                        (LPVOID)index,   // user data sent to thread
                                        0,               // creation flags, 0 = start now
                                        &thread_id[index]); // send id back in this var
    } // end for index

// now enter the printing loop; make sure this takes less time
// than the threads, so it finishes first
for (index = 0; index < 25; index++)
    {
    printf("4 ");
    Sleep(100);
    } // end for index

// now wait for all the threads to signal termination
WaitForMultipleObjects(MAX_THREADS,   // number of threads to wait for
                       thread_handle, // handles to threads
                       TRUE,          // wait for all?
                       INFINITE);     // time to wait, INFINITE = forever

// at this point the threads should all be dead, so close the handles
for (index = 0; index < MAX_THREADS; index++)
    CloseHandle(thread_handle[index]);

// end with a blank line
printf("\nAll threads terminated.\n");

} // end main

Here is an example of the output:

Starting all threads...
4 1 2 3 4 1 2 3 1 4 2 3 2 4 1 3 1 4 2 3 2 4 1 3 1 4 2 3 2 4 1 3 1 4 2 3 2 4 1 3
1 4 2 3 2 4 1 3 1 4 2 3 2 4 1 3 1 4 2 3 2 4 1 4 2 3 2 4 1 3 1 4 2 3 2 4 1 3 1 4
2 3 2 4 1 3 1 4 2 3 2 1 3 2 1 3 2 1 3 2 1 3 2 1 3 2 1 3 2 1 3 2 1 3 2 1 3 2 1 3
2 1 3 2 1 3 2 1 3 2 1 3 2 1 3 2 1 3 2 1
All threads terminated.

The output is as expected. The secondary threads and the primary thread interleave their printing, but when the primary thread's loop completes, the secondary threads keep running until they finish. Because WaitForMultipleObjects() blocks, the primary thread does not terminate until every secondary thread has finished.

Multithreading and DirectX

Now that we have some understanding of multithreading, the next question is how to apply it to game programming and DirectX. Go right ahead; you now have everything you need. Just make sure you compile with the multithreaded run-time library rather than the single-threaded one. Also, once several threads start touching a lot of DirectX resources, you run into the problem of critical sections: you need a global plan to keep more than one thread from using the same resource at the same time. For example, suppose one thread locks a surface and another running thread tries to lock the same surface; that is going to cause problems. Problems like this are solved with semaphores, mutexes, and critical sections. I can't cover them all in detail here, but you can consult a good reference such as Multithreading Applications in Win32 by Jim Beveridge and Robert Wiener, which is about the best book on the subject.

To manage a resource shared by several threads, you need a variable that tracks whether another thread is using the resource, and any thread that wants the resource must test that variable before using it. Of course, this only works if the variable itself can be modified atomically, because you might be halfway through modifying it when another thread gets control. Declaring the variable volatile helps a little, since the compiler will not cache a copy of it, but ultimately you have to use a semaphore (in essence a global counter whose test-and-set is performed by uninterruptible code), a mutex (which admits only one thread at a time; mutual exclusion is a binary choice), or a critical section (a region of code, marked with Win32 calls, that only one thread may execute at a time). On the other hand, if the work done by each thread is completely independent of the others, you do not have to worry about any of this. DEMO11_10.CPP|EXE is a DirectX example that uses multithreaded programming (its 16-bit version is DEMO11_10_16B.CPP|EXE); take a look at it.
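As a minimal sketch of the critical-section idea just mentioned (this is not the code from DEMO11_10.CPP), protecting a shared surface lock might look like the following; Draw_To_Surface is a placeholder name, while the CRITICAL_SECTION calls are standard Win32:

#include <windows.h>

CRITICAL_SECTION surface_cs;   // guards access to the shared surface

// hypothetical drawing routine, defined elsewhere: locks the surface, draws, unlocks
void Draw_To_Surface(void);

// call once at startup, before any threads are created
void Init_Surface_Lock(void)
    {
    InitializeCriticalSection(&surface_cs);
    } // end Init_Surface_Lock

// any thread that wants to draw on the shared surface goes through here
void Safe_Draw(void)
    {
    EnterCriticalSection(&surface_cs);   // only one thread gets past this point

    Draw_To_Surface();                   // the protected work

    LeaveCriticalSection(&surface_cs);   // let the next thread in
    } // end Safe_Draw

// call once at shutdown, after all threads have terminated
void Shutdown_Surface_Lock(void)
    {
    DeleteCriticalSection(&surface_cs);
    } // end Shutdown_Surface_Lock

Critical sections are the lightest of the three mechanisms and work only between threads of the same process, which is exactly the situation in a game.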
