Anyone here any good at programming?
I can't really find any alternate methods that seem better to me. What exactly is the drawback of using linked lists again? Actually... I just now got an idea. What if I had a separate (smaller) hash table to handle the collisions? Obviously I'd have to use a different algorithm for the other hash table, but I think sdbm or whatever was pretty close to as good as rs with my test strings. I think if I did this, I could get it so that it rarely needs to do more than 2-3 iterations to find the player.
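Something like this is what I have in mind. Just a rough sketch: rsHash and sdbmHash below are the usual RS and sdbm string hashes, the table sizes are only examples, and what happens on a second-level collision is left open.
[code]
#include <string>
#include <vector>

// RS hash (Robert Sedgwick) for the main table
unsigned int rsHash(const std::string &s)
{
    unsigned int b = 378551, a = 63689, h = 0;
    for (unsigned char c : s) { h = h * a + c; a *= b; }
    return h;
}

// sdbm hash for the secondary (collision) table
unsigned int sdbmHash(const std::string &s)
{
    unsigned int h = 0;
    for (unsigned char c : s) h = c + (h << 6) + (h << 16) - h;
    return h;
}

struct Player { std::string name; /* ... */ };

struct PlayerTable
{
    std::vector<Player*> main;        // size 2^16
    std::vector<Player*> secondary;   // smaller table, different hash, for collisions

    PlayerTable() : main(1 << 16, nullptr), secondary(1 << 15, nullptr) {}

    bool Add(Player *p)
    {
        Player *&slot = main[rsHash(p->name) & (main.size() - 1)];
        if (!slot) { slot = p; return true; }               // no collision
        Player *&slot2 = secondary[sdbmHash(p->name) & (secondary.size() - 1)];
        if (!slot2) { slot2 = p; return true; }              // collision handled by 2nd table
        return false;                                        // 2nd-level collision: needs a fallback
    }

    Player *Find(const std::string &name)
    {
        Player *p = main[rsHash(name) & (main.size() - 1)];
        if (p && p->name == name) return p;                  // 1st iteration
        p = secondary[sdbmHash(name) & (secondary.size() - 1)];
        if (p && p->name == name) return p;                  // 2nd iteration
        return nullptr;
    }
};
[/code]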
_________________
2101729 Kalantir-Bar-Orc-Mal-Cha escaped the dungeon
There are parts of your problem I don't understand, but I see a possible solution.
Instead of inserting and deleting users as they come & go, why don't you just insert & delete them as they subscribe/unsubscribe? Then just change their "present/absent" status as they log on or off. That way you could use your binary search and only act on the found user if his/her logon status is "present."
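Roughly what I'm picturing, as a sketch (the names here are made up):
[code]
#include <algorithm>
#include <string>
#include <vector>

struct User
{
    std::string name;
    bool online = false;   // the "present/absent" status
    // other per-user data ...
};

// Kept sorted by name; entries are only added/removed on subscribe/unsubscribe,
// so the ordering rarely has to change.
std::vector<User> users;

User *findOnlineUser(const std::string &name)
{
    auto it = std::lower_bound(users.begin(), users.end(), name,
        [](const User &u, const std::string &n) { return u.name < n; });
    if (it != users.end() && it->name == name && it->online)
        return &*it;        // subscribed and currently logged on
    return nullptr;         // unknown, or subscribed but absent
}
[/code]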
My class would look a bit like this:
class Player
{
private:
    // player data ...
    Player *pNextPlayer;      // next player in the list
    Player *pBeforePlayer;    // previous player in the list
public:
    Player(/* player data, */ Player *pBeforePlayer);
    Player *MakeNext(/* player data */);
    // ...
};
With a cover class AllPlayer:
class AllPlayer
{
private:
    Player *pCurrentPlayer;     // the player currently worked on
    Player *pPlayerListStart;   // head of the player list
    // ...
};
----
You can also use the standard template std::list. Such lists are very flexible for storing varying amounts of data; you just have to be a bit careful with heap administration and pointer handling.
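For example (a minimal sketch, with a made-up Player struct):
[code]
#include <list>
#include <string>

struct Player
{
    std::string name;
    // other player data ...
};

int main()
{
    std::list<Player> players;              // the list does the heap administration for you
    players.push_back(Player{"Somebody"});
    players.push_back(Player{"SomebodyElse"});

    // walk the list and erase safely while iterating
    for (auto it = players.begin(); it != players.end(); )
    {
        if (it->name == "SomebodyElse")
            it = players.erase(it);         // erase returns the next valid iterator
        else
            ++it;
    }
}
[/code]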
You need to sort the list dynamically during the build-up process via extra pointers. For searching, you can then approximate the entry point via a list of pointers to potential entry points, and then search along the sorted pointers. Because you never run fully through the list, a small recursion is acceptable here and makes life easier for the programmer.
Please explain what you mean by "approximate the entry point for the search".
In any case, this does not work. It suffers from an O(n) lookup time.
For a fully compact table (no wasted space), an AVL tree will always give O(ln(n)) time.
With a reasonable hash, and a table that is not too full, the same O(ln(n)) can be achieved, but with a smaller constant, and one can approach O(1), by making the table very sparse.
_________________
"Striking up conversations with strangers is an autistic person's version of extreme sports." Kamran Nazeer
Please explain what you mean by "approximate the entry point for the search".
In any case, this does not work. It suffers from an O(n) lookup time.
For a fully compact table (no wasted space), an AVL tree will always give O(ln(n)) time.
With a reasonable hash, and a table that is not too full, the same O(ln(n)) can be achieved, but with a smaller constant, and one can approach O(1), by making the table very sparse.
Consider the following code:
class Data
{
private:
    int A;
    // any other data
    Data *pNext, *pBefore;                                     // mere administration (insertion order)
    Data *pSearchCriteriumBefore_A, *pSearchCriteriumNext_A;   // neighbours in the chain sorted by A
public:
    // ...
};
A second class:
class DataPointer
{
    DataPointer *pBefore, *pNext;   // neighbours in the index list (every ~100th Data element)
    Data *pData;                    // the Data element this index entry points to
    int A;                          // copy of the search key
    // ...
};
A third class:
class DataPointerPointer
{
    DataPointerPointer *pBefore, *pNext;   // neighbours in the coarse index list
    DataPointer *pData;                    // points into the DataPointer list, not at Data directly
    int A;
    // ...
};
And a cover-class:
class DataCover
{
    DataPointer *pDataPointerStart, *pDataPointerCurrent;
    DataPointerPointer *pDataPointerPointerStart, *pDataPointerPointerCurrent;
    Data *pStart, *pCurrent;
};
---
If a new element of type Data is added, it gets appended at the end of the list starting with pStart. Then its value A is read and looked up in the DataPointer list, which gives a nearby element. From that element the search runs along the pSearchCriterium pointers until the new element's place is found; the pSearchCriterium chain is opened there and the new element is linked in. If the gap between the elements represented in the DataPointer list (roughly every 100th element) becomes too large, a new entry is introduced into the DataPointer list.
So you work with a list of entry points, each pointing to a spot close to the assumed right place, and you check at this "entry point" first before searching.
An example: we have a list of 1,000,000 elements and build two index levels of 100 entries each. Looking up in the list of pointers-to-pointers would take at worst 100 steps; if we know the mean of the list we can decide to start in the middle or at the ends, so the first search needs at worst 50 steps, the second likewise, and the third as well, so we end up with about 150 steps. If we use more lists-of-lists and grow them dynamically, the O(n) can be reduced further.
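A stripped-down sketch of the lookup path, with simplified structs instead of the full classes above (it assumes the index lists are sorted ascending by A and are not empty):
[code]
struct Data
{
    int A;
    Data *pSearchCriteriumNext_A;     // next element in the chain sorted by A
};

struct DataPointer                    // one entry per ~100 Data elements
{
    int A;
    Data *pData;
    DataPointer *pNext;
};

struct DataPointerPointer             // one entry per ~100 DataPointer entries
{
    int A;
    DataPointer *pDataPointer;
    DataPointerPointer *pNext;
};

// Walk the coarse index, then the fine index, then the sorted chain itself;
// each stage is at most about 100 steps in the 1,000,000-element example.
Data *find(DataPointerPointer *coarse, int key)
{
    while (coarse->pNext && coarse->pNext->A <= key)
        coarse = coarse->pNext;

    DataPointer *fine = coarse->pDataPointer;
    while (fine->pNext && fine->pNext->A <= key)
        fine = fine->pNext;

    Data *d = fine->pData;
    while (d && d->A < key)
        d = d->pSearchCriteriumNext_A;

    return (d && d->A == key) ? d : nullptr;
}
[/code]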
The idea is not that far from a search tree, and the performance is slightly worse, but I have used this occasionally to gather statistical data along the way, and I recycled the structure for search purposes. It can be set up reasonably with templates, without defining endless new classes (or, if I am being ugly, even with pointers to void).
I'm unable to follow why you would want to replace a simple binary tree of 1,000,000 items, which may be searched in a maximum of 20 comparisons, with some linear search that you say takes "worst case 100 steps", but I strongly doubt that, as you have given no indication of what the distribution of your "int A" might be. What if the values presented for A were the numbers 0...999,999?
The problem under discussion was for 30,000 strings to be looked up, not 1,000,000 ints.
Assuming that a good hash function on the strings gives a result in the range 0..65,535 for instance, that is then directly used to pick up a pointer to what will frequently be the correct item. One can make the number of items searched get arbitrarily close to unity.
Everything I said so far has been straight from memory - and having done quite a lot of sorting and searching. Personally, I even like "Bubble Sort"... in its place.
The Wikipedia article has more detail
http://en.wikipedia.org/wiki/Hash_table
As is often the case (see my Bubble Sort comment above), people shy away from techniques too quickly, at times. E.g. the graph there shows that mere linear probing keeps to an average of a single extra match (i.e., instead of merely finding the item, it averages having to look at one other, per lookup), up until a load factor of nearly 80%.
So, for 30,000 items, a hash table of 40,000 spaces, and then a linear probe, takes an average of two comparisons.
For 1,000,000 items, a hash table of 1,300,000 spaces will similarly average about two comparisons.
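A minimal sketch of such a table, using open addressing with linear probing (the sdbm-style hash and the sizes are placeholders, and it assumes the table never becomes completely full):
[code]
#include <string>
#include <vector>

struct Entry
{
    std::string key;
    bool used = false;
    // payload ...
};

struct ProbingTable
{
    std::vector<Entry> slots;

    explicit ProbingTable(size_t n) : slots(n) {}

    size_t hash(const std::string &s) const
    {
        size_t h = 0;                         // sdbm-style string hash
        for (unsigned char c : s) h = c + (h << 6) + (h << 16) - h;
        return h % slots.size();
    }

    Entry *find(const std::string &key)
    {
        size_t i = hash(key);
        while (slots[i].used)                 // linear probe until hit or empty slot
        {
            if (slots[i].key == key) return &slots[i];
            i = (i + 1) % slots.size();
        }
        return nullptr;
    }

    void insert(const std::string &key)
    {
        size_t i = hash(key);
        while (slots[i].used && slots[i].key != key)
            i = (i + 1) % slots.size();
        slots[i].used = true;
        slots[i].key = key;
    }
};

// ProbingTable table(40000); holding 30,000 names runs at ~75% load,
// so lookups average roughly two comparisons.
[/code]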
=============
PS. If you wish to present code, please wrap it up with [code] [/code] tags, as the indentation (and other white space) will be preserved.
_________________
"Striking up conversations with strangers is an autistic person's version of extreme sports." Kamran Nazeer
Ok, so, I've decided that for now, I'm going to stick with the linked list to handle my collisions. If it proves to be a problem for some reason or another, I'm going to set it up so that a collision triggers the creation of a new hash table of half the size of the previous one. For example, my main table is going to be of size 2^16; if a collision happens (which it will), a new table gets created of size 2^15 and the collision goes in there using a different algorithm to hash it. If a collision happens in that table, a new hash table (we'll refer to these as levels of collision; this would be the second level of collisions) gets created of size 2^14, and it goes in there with yet another algorithm. I know of at least 8 different algorithms which are efficient and give fairly good spread. If I end up with so many collisions that I get to a table of size 2^9, then all future ones will just be 2^9. Tell me what you think of this idea. I haven't seen any articles that talk about using a method like this before. I got the idea from double hashing and just took it a step further.
Also, I currently have it set up as 2 tables. I found out that (at least with epoll) file descriptors get assigned starting at 5, incrementing by 1 for each new connection, reusing old fds that aren't connected anymore first. So, I created another table so I can look up players by FD. I can do this in O(1). Basically, when I call PlayerTable.Add(Player); it hashes their name, sticks them in the main table, then takes their FD and sets fdTable[FD] to point to the same location. This way I can look them up by FD, and I only need to look them up by name if that's the only info about them I have available.
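Roughly how the two-table setup looks (just a sketch: std::hash stands in here for the rs hash, the fd table grows on demand, and the linked-list collision handling is left out):
[code]
#include <functional>
#include <string>
#include <vector>

struct Player
{
    std::string name;
    int fd = -1;            // socket file descriptor
    // other player data ...
};

struct PlayerTable
{
    std::vector<Player*> nameTable;   // indexed by hash of the name (size 2^16)
    std::vector<Player*> fdTable;     // indexed directly by file descriptor

    PlayerTable() : nameTable(1 << 16, nullptr), fdTable(1024, nullptr) {}

    void Add(Player *p)
    {
        size_t i = std::hash<std::string>{}(p->name) & (nameTable.size() - 1);
        nameTable[i] = p;                       // collision handling omitted here

        if ((size_t)p->fd >= fdTable.size())
            fdTable.resize(p->fd + 1, nullptr);
        fdTable[p->fd] = p;                     // same player, reachable by fd in O(1)
    }

    Player *ByFd(int fd) { return fdTable[fd]; }
    Player *ByName(const std::string &name)
    {
        return nameTable[std::hash<std::string>{}(name) & (nameTable.size() - 1)];
    }
};
[/code]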
I never did answer this. The reasons are: 1. I find networking on Linux to be easier, and, the main reason, 2. Linux has epoll, while Windows is stuck using poll or select, which are horrible by comparison when handling large numbers of file descriptors, as you can see from this benchmark image.
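For what it's worth, the basic epoll pattern is only a few calls (a bare-bones sketch; error handling and non-blocking setup are omitted):
[code]
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

// listenFd is assumed to be a listening socket that is already set up
void eventLoop(int listenFd)
{
    int epfd = epoll_create1(0);

    epoll_event ev{};
    ev.events = EPOLLIN;
    ev.data.fd = listenFd;
    epoll_ctl(epfd, EPOLL_CTL_ADD, listenFd, &ev);

    epoll_event events[64];
    for (;;)
    {
        int n = epoll_wait(epfd, events, 64, -1);    // block until something is ready
        for (int i = 0; i < n; ++i)
        {
            int fd = events[i].data.fd;
            if (fd == listenFd)
            {
                int client = accept(listenFd, nullptr, nullptr);
                epoll_event cev{};
                cev.events = EPOLLIN;
                cev.data.fd = client;
                epoll_ctl(epfd, EPOLL_CTL_ADD, client, &cev);
            }
            else
            {
                char buf[4096];
                ssize_t r = read(fd, buf, sizeof buf);
                if (r <= 0)                          // client closed or error
                {
                    epoll_ctl(epfd, EPOLL_CTL_DEL, fd, nullptr);
                    close(fd);
                }
                // else: handle r bytes of input for this fd
            }
        }
    }
}
[/code]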
_________________
2101729 Kalantir-Bar-Orc-Mal-Cha escaped the dungeon
t0
Veteran
Joined: 23 Mar 2008
Age: 51
Gender: Male
Posts: 726
Location: The 4 Corners of the 4th Dimension
Win32 uses IoCompletionPorts for efficient network processing. You don't use polling at all.
Win32 uses IoCompletionPorts for efficient network processing. You don't use polling at all.
I don't suppose you have a link to a decent article about it I could read?
_________________
2101729 Kalantir-Bar-Orc-Mal-Cha escaped the dungeon
A plain google gives Microsoft TechNet's fairly useless overview:
http://technet.microsoft.com/en-us/sysi ... 63891.aspx
followed by a somewhat more useful article, by the look of it:
http://xania.org/200807/iocp
which gives a link to the MSDN documentation, but calls it "rubbish".
I don't see anywhere that says which version of Microsoft NT-based OS you have to buy, in order to have access to the calls. I seem to recall that each time I tried to use the more subtle parts of Microsoft APIs, it turned out that they were only available on something that cost extra.
_________________
"Striking up conversations with strangers is an autistic person's version of extreme sports." Kamran Nazeer
t0
Veteran
Joined: 23 Mar 2008
Age: 51
Gender: Male
Posts: 726
Location: The 4 Corners of the 4th Dimension
Win32 uses IoCompletionPorts for efficient network processing. You don't use polling at all.
I don't suppose you have a link to a decent article about it I could read?
http://msdn.microsoft.com/en-us/magazine/cc302334.aspx
It's an article but has a sample that might get you started.
IoCompletionPorts have been around since NT 3.5 (or longer) and made it into the home-user versions of Windows starting with XP. I've personally used completion ports to handle hundreds of thousands of simultaneous connections (you need multiple local IP addresses for this since port numbers are limited to 16 bits).
I believe Microsoft's C# implementation of Socket is built on top of IoCompletionPorts as well. You can call Socket.BeginReceive() or BeginSend() to provide your buffer and a callback function. The lower-level code will execute your callback function when new data is available or data has been written.
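The raw Win32 pattern in C++ looks roughly like this. It's only a sketch: the port itself comes from CreateIoCompletionPort(INVALID_HANDLE_VALUE, nullptr, 0, 0), and accept handling, error checking and buffer lifetime are all left out.
[code]
#include <winsock2.h>
#include <windows.h>

struct Connection                 // per-socket context, passed as the completion key
{
    SOCKET     sock;
    WSABUF     buf;
    char       data[4096];
    OVERLAPPED overlapped;
};

// Associate a connected socket with the completion port and post the first read.
void addConnection(HANDLE iocp, Connection *c)
{
    CreateIoCompletionPort((HANDLE)c->sock, iocp, (ULONG_PTR)c, 0);

    c->buf.buf = c->data;
    c->buf.len = sizeof c->data;
    c->overlapped = OVERLAPPED{};
    DWORD flags = 0;
    WSARecv(c->sock, &c->buf, 1, nullptr, &flags, &c->overlapped, nullptr);
}

// Worker thread: completions are delivered here; nothing is polled.
DWORD WINAPI worker(LPVOID param)
{
    HANDLE iocp = (HANDLE)param;
    for (;;)
    {
        DWORD bytes = 0;
        ULONG_PTR key = 0;
        OVERLAPPED *ov = nullptr;
        if (!GetQueuedCompletionStatus(iocp, &bytes, &key, &ov, INFINITE))
            continue;                                    // failed or aborted I/O

        Connection *c = (Connection *)key;
        if (bytes == 0) { closesocket(c->sock); continue; }   // peer closed

        // handle 'bytes' of received data in c->data, then post the next read
        c->overlapped = OVERLAPPED{};
        DWORD flags = 0;
        WSARecv(c->sock, &c->buf, 1, nullptr, &flags, &c->overlapped, nullptr);
    }
}
[/code]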
Thanks guys. I've already read some of that, but it didn't quite tell me everything I was wanting to know. I very likely won't be using it since I'm already well into my project on Linux... In fact, I'm far enough along at this point that I need to start doing more work on my client before continuing on the server. Just curious though... Do you know how efficient it is in comparison to epoll or kqueue? I've tried looking it up, but couldn't find any sort of comparison at all.
_________________
2101729 Kalantir-Bar-Orc-Mal-Cha escaped the dungeon
t0
Veteran
Joined: 23 Mar 2008
Age: 51
Gender: Male
Posts: 726
Location: The 4 Corners of the 4th Dimension
I don't think they're directly comparable across different OSes. You'd have to build similar systems that process messages and then measure them on comparable hardware. Last I read, MS recommends 2 threads per processor when using completion ports, which means you have to handle synchronization issues. I would imagine you'd have similar issues if you ran multiple threads calling epoll() as well.