[LC++][OT] Which structure to choose

Mon Mar 5 10:59:09 UTC 2007

Julien Claassen wrote:
>   I'm still working on a console-based UI-library and I wonder in which 
> structure to store my screen-objects. Each object derives from a common base.
>   

[puzzled]

>   I want to be able to do the following:
>   1. Probably search through all elements for a given word or whatever.
>   2. move down/up left/right. So If i'm in row 1/column 13 and press cursor 
> down, the cursor is positioned on the element, which is placed in row 2/column 
> 10.
>   

It sounds like you are describing a text widget.

>   Any good idea, an example, where someone had to solve a similar problem?
>   Here are a few of my own first thoughts:
> 1. Take a vector of vectors. The first "Big Vector" represents the list of 
> rows. The second "Small vector" represent one row. The vectors could be 
> allocated statically at program startup. So the system tell me 25 rows/80 
> columns I tell the first vector to reserve space for 25 elements and the 
> second for 80 elements. I wonder if this is good practise, for it's a bit of a 
> memory waste.
>   

Ok, then we do seem to be talking about text widgets...

There are a number of different approaches you can take on text buffers 
- which is what you are describing - many of which have different 
features with their own disadvantages and advantages. Below is my take, 
others may offer different and equally valid approaches.

First, you need to separate the idea of how text is handled within your 
program and how it is *rendered* on your user interface and be careful 
not to mix the two to avoid unnecessary complexity and overhead. Your 
text objects should not contain any rendering information nor know 
anything about the renderer other than, perhaps, their existence. 
Rendering objects are typically some type of "window", "frame", "panel" 
or similar that knows how to render text, but don't actually contain the 
text. I suspect your equivalent would be a "screen".

On the rendering side you will need information such as "current column" 
and "current line" (where the cursor is), "number of displayed lines", 
"line at the top" and "display width" and possibly others. Some of these 
need to be adjusted with almost every operation, even if the operation 
does not affect the text. Handling these does not require changing how 
your text is stored or assume anything about its format, size or other 
attributes, and for performance reasons if nothing else, you don't want 
to have to change anything about the text, for example, every time the 
user moves the insert cursor.

Rendering is an operation on the text itself and this is not where the 
text itself should be stored - rendering is an operation that is 
*applied* to text, and things like insert cursor movement only affect 
rendering when it moves out of your viewport (if it can). The text 
objects themselves would usually store the text as a block of data 
rather than as a vector/list or as array of lines (or pointers/reference 
to lines). Some editors do use the vector approach, but from my own 
experience this only lends itself to line-based operations and is 
somewhat limiting when it comes to search. Fast text search will usually 
perform best on a contiguous block (e.g. boyer-moore search for text on 
a block is blindingly fast). If you do extended regular expression 
searching like pcre, you can even match data that crosses line 
boundaries, something that is made much more difficult if the text is 
split into lines, plus you will need to iterate through every line.

On the text handling side, you might want to check out the source for 
the editor joe, which uses what it calls the "gap buffer" algorithm that 
greatly reduces reallocation and moving of text about in memory for the 
most common editor operations (inserting, deleting, appending blocks of 
text). I haven't yet found a better algorithm for most cases and it only 
suffers performance problems with extremely large amounts of text, but 
you get that with almost any non-vector approach.

All your renderer should need is

    a) knowing what to render, ie the text object

    b) attributes of the display (lines/width)

    c) offset into text of the top/left coordinate
       (assuming left-right text direction) which indirectly gives you 
the current
        line/column so you just need some way to quickly convert between 
line/
       column position and corresponding text offset)

    d) current cursor position (which also gives you the current cursor 
line/column
        number via the same conversion

Hope this helps,
David