John Machin
2/26/2008 8:30:00 PM
On Feb 27, 6:28 am, Lytho...@gmail.com wrote:
> Hi All,
>
> I have a python utility which helps to generate an excel file for
> language translation. For any new language, we will generate the excel
> file which will have the English text and column for interested
> translation language. The translator will provide the language string
> and again I will have python utility to read the excel file target
> language string and update/generate the resource file & database
> records. Our application is VC++ application, we use MS Access db.
>
> We have string table like this.
>
> "STRINGTABLE
> BEGIN
> IDS_CONTEXT_API_ "API Totalizer Control Dialog"
> IDS_CONTEXT "Gas Analyzer"
> END
>
> STRINGTABLE
> BEGIN
> ID_APITOTALIZER_CONTROL
> "Start, stop, and reset API volume flow
> \nTotalizer Control"
> END
> "
> this repeats.....
>
> I read the file line by line and pick the contents inside the
> STRINGTABLE.
>
> I want to use a regular expression which should give me all the
> entries within
> STRINGTABLE
> BEGIN
> <<Get what ever put in this>>
> END
>
> I tried little bit, but no luck. Note that it is multi-line string
> entries which we cannot make as single line
>
Looks to me like you have a very simple grammar:
    entry ::= id quoted_string
    id is matched by r'[A-Z]+[A-Z_]+'
    quoted_string is matched by r'"[^"]*"'
So a pattern which will pick out one entry would be something like
    r'([A-Z]+[A-Z_]+)\s+("[^"]*")'
Note that using \s+ (whitespace) allows for having \n etc between id
and quoted_string.
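
A quick way to check the \s+ point is to try the pattern on two
one-entry strings, one with the id and the string on the same line and
one with a newline between them. This is only a sketch; the names
entry_re, same_line and split_lines are made up for the illustration,
and the second input is a shortened version of your second entry.

```python
import re

# The entry pattern from above: group 1 is the id, group 2 the quoted string.
entry_re = re.compile(r'([A-Z]+[A-Z_]+)\s+("[^"]*")')

# Hypothetical inputs: id and string on one line, then split over two lines.
same_line = 'IDS_CONTEXT "Gas Analyzer"'
split_lines = 'ID_APITOTALIZER_CONTROL\n"Totalizer Control"'

m1 = entry_re.search(same_line)
m2 = entry_re.search(split_lines)

print(m1.groups())
print(m2.groups())
```

Both searches should find a match, because \s+ accepts a space and a
newline alike. The quotes stay in group 2; strip them afterwards if
you don't want them.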
You need to build a string containing all the lines between BEGIN and
END, and then use re.findall.
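
Here is a rough sketch of that approach, not tested against a real
resource file. It assumes that STRINGTABLE, BEGIN and END each sit
alone on a line, as in your sample, and that no line inside a quoted
string consists of just END. The names stringtable_entries, ENTRY_RE
and sample are invented for the example.

```python
import re

ENTRY_RE = re.compile(r'([A-Z]+[A-Z_]+)\s+("[^"]*")')

def stringtable_entries(lines):
    """Return (id, quoted_string) pairs from STRINGTABLE BEGIN ... END blocks."""
    entries = []
    chunk = None   # None means we are outside a BEGIN ... END block
    prev = ""      # last non-blank line seen outside a block
    for line in lines:
        word = line.strip()
        if chunk is None:
            # Assumption: BEGIN opens a string table only right after STRINGTABLE.
            if word == "BEGIN" and prev.startswith("STRINGTABLE"):
                chunk = []
            if word:
                prev = word
        elif word == "END":
            # Join the collected lines into one string, then pick out the entries.
            entries.extend(ENTRY_RE.findall("\n".join(chunk)))
            chunk = None
            prev = ""
        else:
            chunk.append(line.rstrip("\n"))
    return entries

# The sample from the question, as a raw string so that \n stays two characters.
sample = r'''STRINGTABLE
BEGIN
IDS_CONTEXT_API_ "API Totalizer Control Dialog"
IDS_CONTEXT "Gas Analyzer"
END

STRINGTABLE
BEGIN
ID_APITOTALIZER_CONTROL
"Start, stop, and reset API volume flow
\nTotalizer Control"
END
'''

entries = stringtable_entries(sample.splitlines())
for ident, text in entries:
    print(ident, repr(text))
```

Joining each block before calling re.findall is what lets an entry
span several lines; matching line by line would miss the third one.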
If you still can't get it to work, ask again -- but do show the code
from your best attempt, and reduce ambiguity by showing your test
input as a Python expression e.g.
test1_in = """ ID_F "fough"
ID_B_
"barre"
ID__Z
"zotte start
zotte end"
"""
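
For what it's worth, here is how the pattern above can be tried on
that input with re.findall. The input is repeated so that the snippet
runs on its own; the name entries is just for the illustration.

```python
import re

test1_in = """ ID_F "fough"
ID_B_
"barre"
ID__Z
"zotte start
zotte end"
"""

# One (id, quoted_string) tuple per entry, in the order they appear.
entries = re.findall(r'([A-Z]+[A-Z_]+)\s+("[^"]*")', test1_in)
for ident, text in entries:
    print(ident, repr(text))
```

This should give three entries, the last of which keeps the embedded
newline inside its quoted string.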