Mike Curry
|
| Posted: 03/20/2003, 9:32 AM |
|
This is a request for opinions on the best way to handle "paging" records from a
Mysql database.
I have a database that is mostly text (except for identifiers and external file names). Each record is one of an ordered sequence, and there are many sequences in each table. (These are simple tutorials, organized as one record = one page of html). The idea is for the user to sequence through the records by clicking on a "Next page" button that increments the record counter, and each click produces a new html page. Record fields are sized quite nicely to assemble on one screen at 800x600 resolution.
For now, I am using a result set that returns all records of a topic, ordered by sequenceID. mysql_fetch_array with mysql_num_rows gives me the total record count, and I can sequence by returning a submit or something javascriptish (haven't done much investigation there yet) with the "Next" button. My doubts arise over efficient memory usage in Apache server, php, and mysql. These result sets average 30Kb, but can go as high as 256Kb. With multiple users, how much overhead am I creating with memory allocation? How do I decide on whether it's worth it to trade-off by querying for one record instead of the set? In that regard, I could generate a table containing the num_rows value for each topic.
Opinions wanted. If you like, you can respond by email tomy3rs@direcway.com
Thanks, Mike
|
|
|
 |
bsiler
|
| Posted: 03/20/2003, 9:54 PM |
|
Reading your question it leads me to ask, What are you really concerned with?
1. Code performance
2. Resource allocation (memory, disk, network bandwidth)
3. System contention (service time, demand, utilization, throughput and
response time)
You may say, "Well all of it!". What you are asking is all relative in the world of computing. A good coder will create effecient code, a good network administrator will balance traffic and manage bandwidth, a good PC tech will order a system that grow with your needs and maintain his server in good order.
If everyone does their job everything will work. I know, what dream environment am I in? Trying to make a point. When you pose a question ensure you know your "Goal". Think about it.
When I began coding years ago I used only 128 K of memory to run a topographical program on a Response 280 Mainframe. How? I came down to good coding techniques and using learned techniques and the new features of the language, much like those found in PHP and MySql. So until you can provide me with what your goal is (and it's not how to query a database) I will try to focus on what I feel will help resolve one of the problems of your goal.
Obviously, the time and memory waste increases as the data types become more complex; so I would look at the following code techniques and how to employ them:
1. Reference Counting
2. Automatic Recource alloction
3. Multi-Dimensional Arrays
What is Reference Counting?
Reference counting is a new mechanism in the PHP 4 Zend engine for eliminating memory allocation and copying of a value when it is assigned to a new variable. Instead, the new variable points to the same memory location for that value. In addition, reference counting keeps track of how many variables are currently pointing to the particular memory location for that value: when no variables point to that value, its memory location is freed.
The following example demonstrates this behavior:
$cat = 5;
$dog = $cat;
In the first line of code, $cat is bound to the value 5, and the reference count (the counter that keeps track of how many variables point to this specific value) for this value of 5 is set to 1. This way the Zend engine knows how many variables are bound to that specific value.
In the second line of code, $dog is also bound to the same value of 5 that $cat is bound to, that is, it is bound to the same memory location. As a result of $dog becoming bound to this value of 5, the reference count for the value is incremented from 1 to 2.
Now you're probably asking yourself: What happens if I assign a new value, for instance, the number 3, to $cat? Does $dog change as well?
The answer is: No, $dog does not change. The Zend engine detaches $cat from its previous value (5) that it shared with $dog, decrement the reference count for that value to 1, and create a new value (3) for $cat having an initial reference count of 1. By the same token, if $dog is then assigned a new value, say, 9, the reference count for the old value of 5 decrements to 0, and the memory location for that value of 5 is freed.
Note: PHP 4 has a feature for explicitly telling Zend that two variables really are aliases for one other, so that changing one variable does change the other.
Advantages of Reference Counting: Memory Consumption Savings
Almost no memory is consumed when assigning an existing variable to a new variable, because the assigned value of the existing variable is not copied. The new variable is simply set to point to the memory location of the value of the existing variable. The reduction in memory consumption is proportional to the size of the assigned value.
Advantages of Reference Counting: Performance Improvements
Almost no time is consumed when assigning an existing variable to a new variable, because the assigned value of the existing variable is not copied: the value that is assigned to the new variable merely has its reference counter incremented. Again, the improvement in performance is proportional to the size of the assigned value.
Automatic Resource Control
PHP 4 introduces a new data type called a resource. Many of the extension modules - for instance, the MySQL, Oracle, and XML modules - have been modified to return resource handles as their results. These resources (such as SQL query results) are reference counted, just like other data types in PHP. This means that these resources are automatically freed by PHP 4 when they are not referenced anymore, that is, when their reference count decrements to 0.
Consider the following PHP 3 code, where sql_get_row(), query_result(), and query_free() represent standard SQL functions:
while( $row = sql_get_row($result_handle))
{
print query_result($row, "name");
query_free($result_handle);
}
Often, a PHP 3 programmer may forget to write the line of code for query_free($result_handle). At run time, therefore, a new memory location is used for storing each new value of $row (which is actually an SQL resource handle) - this is a proven way to quickly run out of memory.
In PHP 4, however, every value has a reference counter, so that memory locations for the old SQL resources of $row are automatically freed. The same code as above - without the line of code for query_free($result_handle) - in PHP 4 functions as follows:
while( $row = sql_get_row($result_handle))
{
print $row;
}
When $row is first assigned a value (an SQL result handle), PHP 4 sets the reference counter for that value to 1.
When $row is assigned a new value on the next pass through the loop, the reference counter for that new value is also set to 1. However, since $row no longer points to its previous value, PHP 4 therefore decrements the reference counter of the previous value from 1 to 0.
In PHP 4, once a reference counter of a value becomes 0, its memory location is automatically freed. As a result, memory locations for only two values (two SQL result handles) of $row are all that is ever needed.
Multi-Dimensional Arrays
The savings in memory consumption and copying time become greater when the data type involved is a multi-dimensional array. Consider the following multi-dimensional array:
$personnel = array(array("first name" => "Tom", "age" => 44),
array("first name" => "Dick", "age" => 33),
array("first name" => "Harry", "age" => 22));
$hired_personnel = $personnel;
This variable assignment in PHP 4 causes $hired_personnel to point to the memory location containing the array that $personnel points to, and increments the array's reference count. (In PHP 3, the assignment to $hired_personnel would require making a new copy in memory of the array that $personnel points to.)
If $personnel were a 500 by 20 array, the saving would exceed 10,000 memory locations. If the average data element used 10 bytes, that would be over 100 kilobytes of memory saved, and a proportional amount of processing time saved by not copying those 100 kilobytes.
So Mike as you can see, good coding techniques and the ability to use only the resources needed, not what is available, will ensure your application will function as needed.
|
|
|
 |
Mike Curry
|
| Posted: 03/27/2003, 3:15 PM |
|
Interesting expose', bsiler. Thanks for the information. Since my primary concern is, in fact, your item #3, "system contention", it would seem that my best course of action is to make sensible use of variable assignment and updating, keeping in mind the "reference counting" feature of PHP 4, almost to the extent that my script can act like it's running as an application on one machine, rather than as a client/server. Although the improved handling of multidimensional arrays makes it tempting to run wide-open sessions and full fetch_array processing, I think I will be a bit more conservative, and go with a single-row retrieval, creating a separate table to store row-counts, and run that backwards and forwards (reverse sequencing is required) as a variable handed to a display_screen() function.
I never worked on mainframes, but I started out writing Z80 assembler for ROMable modules in 16K EPROMs. You had 128K on that mainframe? Wow, that's a mighty luxurious heap. :) Regards, Mike.
|
|
|
 |
|