DJ Dharme
10/9/2008 1:01:00 PM
On Oct 9, 2:16 pm, Maxim Yegorushkin <maxim.yegorush...@gmail.com>
wrote:
> On Oct 9, 5:56 am, DJ Dharme <donjuandharmap...@gmail.com> wrote:
>
>
>
> > On Oct 8, 7:20 pm, Maxim Yegorushkin <maxim.yegorush...@gmail.com>
> > wrote:
>
> > > On Oct 8, 2:16 pm, DJ Dharme <donjuandharmap...@gmail.com> wrote:
>
> > > > On Oct 8, 5:17 pm, Maxim Yegorushkin <maxim.yegorush...@gmail.com>
> > > > wrote:
>
> > > > > On Oct 8, 10:20 am, DJ Dharme <donjuandharmap...@gmail.com> wrote:
>
> > > > > > Does anybody know about a solid implementation of a map or a set
> > > > > > with dynamic order statistics which allows the user to access a node
> > > > > > by its Index in log(N) time. I have a problem of sorting a huge data
> > > > > > set and showing it row by row. I have to quickly jump to a new
> > > > > > starting point on demand. If I use the std set class it takes so many
> > > > > > iterations to jump to a new location since we have to increment the
> > > > > > iterators until we get the correct row no.
>
> > > > > Can you not use a sorted vector? Accessing elements by key is
> > > > > O(lg(N)), by index is O(1) and there is no memory overhead compared to
> > > > > std::set/map.
>
> > > > Thanks for the reply, yes I have tested this with vectors but failed
> > > > since I have to dynamically change (add, update, remove) the records.
> > > > The record count can grow up to few millions. And when the vector
> > > > starts to re-allocate memory the program hangs for some time.
>
> > > []
>
> > > In this case you can use std::deque<> to avoid paying for reallocating
> > > and moving many elements.
>
> > > > Also I have to sort the records on each update. So if I
> > > > use a vector large amount of items will be moved back and forth due to
> > > > insertions and removals.
>
> > > To insert an element in a sorted vector or deque you do lower_bound
> > > followed by insert. No need to sort again the whole container.
>
> > > --
> > > Max
>
> > Thanks Max, I forgot to mention that I have used a binary search to
> > insert items into the sorted vector. I never knew that I can use
> > lower_bound for this.
>
> > Just a small question, according to the documents the iterators in the
> > deque are getting invalidated when we insert new items to it. I am
> > currently having a map of iterators with the primary key of the
> > records as the key. This allows me to directly find the iterators to
> > the records without iterating through the set to find the correct
> > record. I am giving a sample code to make my points clear.
>
> > #include <map>
> > #include <set>
> > #include <string>
>
> > struct keycomp
> > {
> > bool operator()(const char* zLeft, const char* zRight)
> > {
> > return strcmp(zLeft, zRight) < 0;
> > }
>
> > };
>
> > class TableRow
> > {
> > public:
> > TableRow();
> > ~TableRow();
>
> > std::string s_Key;
> > void* p_Data;
> > bool b_HasChanged;
>
> > };
>
> > class TableModel;
>
> > class TableRowComparator
> > {
> > public:
> > TableRowComparator(TableModel* pModel):p_Model(pModel){}
> > ~TableRowComparator(){}
>
> > bool operator()(TableRow* pLeft, TableRow* pRight)
> > {
> > return p_Model->CompareRows(pLeft, pRight);
> > }
>
> > TableModel* p_Model;
>
> > };
>
> > typedef std::set<TableRow*, TableRowComparator> TABLE_ROW_SET;
> > typedef std::map<const char*, TABLE_ROW_SET::iterator, keycomp>
> > TABLE_ROW_ITR_MAP;
>
> > class TableModel
> > {
> > public:
> > TableModel(): o_RowComparator(this), set_TableRows(o_RowComparator)
> > {
>
> > }
>
> > // Compare rows according whatever the way the user wants
> > //
> > bool CompareRows(TableRow* pLeft, TableRow* pRight)
> > {
> > //Do whatever comparison here
> > return pLeft->p_Data - pRight->p_Data < 0;
> > }
>
> > // This method adds a record from the table control, if a record
> > exists for
> > // the same key, it will update it with the new data
> > //
> > void AddData(const char* zKey, void* pData)
> > {
> > TABLE_ROW_ITR_MAP::iterator itrMap = map_TableRowItrs.find(zKey);
>
> > TableRow* pRow = NULL;
>
> > if(itrMap != map_TableRowItrs.end)
> > {
> > TABLE_ROW_SET::iterator itrSet = itrMap->second;
>
> > pRow = *itrSet;
> > set_TableRows.erase(itrSet); // Erase the existing row since
> > // new one may be inserted to a
> > // different place
> > map_TableRowItrs.erase(itrMap);
>
> > }
> > else
> > {
> > pRow = new TableRow;
> > pRow->s_Key = zKey;
> > }
>
> > pRow->p_Data = pData;
> > pRow->b_HasChanged = true;
>
> > map_TableRowItrs.insert(pRow->s_Key.c_str(),set_TableRows.insert(pRow).first);
>
> > }
>
> > // This method removes a record from the table control
> > //
> > void RemoveData(const char* zKey)
> > {
> > TABLE_ROW_ITR_MAP::iterator itrMap = map_TableRowItrs.find(zKey);
>
> > if(itrMap != map_TableRowItrs.end)
> > {
> > TABLE_ROW_SET::iterator itrSet = itrMap->second;
>
> > TableRow* pRow = *itrSet;
> > set_TableRows.erase(itrSet);
> > map_TableRowItrs.erase(itrMap);
>
> > delete pRow;
> > }
> > }
>
> > // This method needs to access the set items by its index to show it
> > in a
> > // virtual list control on the Front-End
> > //
> > void* GetData(int iIndex)
> > {
> > // How to make this fast?, this is my question
> > // I am currently doing a caching mechanism which is not mentioned
> > here
> > // if I have a set with dynamic order statistics this can be taken
> > in Log(N) time
> > // instead of the N time
>
> > if(iIndex < set_TableRows.count())
> > {
> > TABLE_ROW_SET::iterator itrSet = set_TableRows.begin();
> > for (int iRow = 0 ; iRow < iIndex ; ++iRow, ++itrSet);
>
> > return (*itrSet)->p_Data;
> > }
>
> > return NULL;
> > }
>
> > protected:
> > TableRowComparator o_RowComparator;
> > TABLE_ROW_SET set_TableRows;
> > TABLE_ROW_ITR_MAP map_TableRowItrs;
> > };
>
> Now I see that you container actually has three indexes:
> 1) TableRow::s_Key.
> 2) TableRow::p_Data.
> 3) array index
>
> As the first step I would try getting rid of index 3. Most grid
> controls allow associating user data (a pointer) with a cell. You
> could associate a pointer to TableRow with a row in a grid. Another
> way is that you could extract the key from the GUI row and use that
> instead of index.
>
> Next you could use boost::multi_index container which allows having
> multiple indexes in the same container. This way you don't need to
> maintain an additional map of iterators/pointers for every additional
> index. The code is simpler, more robust and less memory waste.
>
> --
> Max
>
> --
> Max
Thank you very much for the guidance. But I can't go for your first
solution since I am driving a Front-End application from a Back-End.
What I get from the front end is the visible row range (ex:- 20 to
100) so that the only way I have to decide the data to be sent is by
its row index.
I think I can use the boost multi index container as a bi-directional
map (Example 4). Thanks again for the wonderful job that you are
doing.
DJD