[Mobile Application Development Technology] 4 C++ Boost Regular Expressions

Uploaded by: m***   IP location: Hubei   Upload date: 2023-05-08   Format: DOCX   Pages: 40


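Boost regular expressions, contents:

Offline documentation
Stripping tags from an HTML file
A regex test program
Regex metacharacters
Anchors
Matching several letters followed by several digits
Marking: whatever sits inside a pair of parentheses (); Boost does not need them escaped
?: Not marked, cannot be back-referenced
Repeats [greedy matching: match as much as possible]
Non-greedy matching [match as little as possible]
Stream mode (possessive repeats): never looks back, for performance
Back references: a marked expression must exist
Alternation (OR)
Word boundaries
Named expressions
Comments
Branch reset
Lookahead
Example 1: match only th, not ing, but ing must be present
Example 2: ing is checked without being consumed, then in is matched
Example 3: th matches unless it is followed by ing
Lookbehind
Recursive expressions
Operator precedence
Showing the number of sub-expressions
Boost regex: sub_match
Boost regex: the regex_replace algorithm
Boost regex: iterators
Boost regex: -1 selects the text that was not matched
Boost regex: captures. Why does the official code crash?
Boost regex: official example
Boost regex: a search-based simple lexer that analyses C++ class definitions
Boost regex: an iterator-based simple lexer that analyses C++ class definitions
Boost regex: converting a C++ file to an HTML file
Boost regex: extracting all links from a web page

Offline documentation:
boost_1_62_0/libs/regex/doc/html/boost_regex/syntax/perl_syntax.html

Stripping tags from an HTML file:
chunli@Linux:~/workspace/Boost$ sed 's/<[\/]\?\([[:alpha:]][[:alnum:]]*[^>]*\)>//g' index.html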
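The same tag-stripping idea can be sketched in C++. This is a minimal illustration that assumes the standard library's std::regex (ECMAScript grammar) instead of Boost.Regex, so that it builds without extra libraries; strip_tags is a hypothetical helper name, not something from these notes.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: removes anything that looks like an HTML tag,
// mirroring the sed command above.
std::string strip_tags(const std::string& html)
{
    // '<', an optional '/', one letter, further alphanumerics,
    // then everything up to the closing '>'.
    static const std::regex tag("</?[[:alpha:]][[:alnum:]]*[^>]*>");
    return std::regex_replace(html, tag, "");
}

// Small self-check that doubles as a usage example.
inline void self_check_strip_tags()
{
    assert(strip_tags("<p>Hello <b>world</b></p>") == "Hello world");
}
```

Text without tags passes through unchanged, and a lone '<' that is not followed by a letter is left alone.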

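A regex test program:

chunli@Linux:~/boost$ cat main.cpp
#include <iomanip>
#include <iostream>
#include <string>
#include <boost/regex.hpp>
using namespace std;

int main(int argc, const char* argv[])
{
    if (argc != 2)
    {
        cerr << "Usage: " << argv[0] << " regex-str" << endl;
        return 1;
    }
    boost::regex e(argv[1], boost::regex::icase);
    // mark_count() returns the number of marked sub-expressions in the regex,
    // that is, the parts of the expression enclosed in parentheses.
    cout << "subexpressions: " << e.mark_count() << endl;
    string line;
    while (getline(cin, line))
    {
        boost::match_results<string::const_iterator> m;
        if (boost::regex_search(line, m, e, boost::match_default))
        {
            // Print the whole match followed by each marked sub-expression.
            const int n = m.size();
            for (int i = 0; i < n; ++i)
            {
                cout << m[i] << " ";
            }
            cout << endl;
        }
        else
        {
            // No match: print a row of dashes as wide as the input line.
            cout << setw(line.size()) << setfill('-') << '-' << right << endl;
        }
    }
    return 0;
}

Regex metacharacters:
.[{}()\*+?|^$

Anchors:
A '^' character shall match the start of a line.
A '$' character shall match the end of a line.

Matching several letters followed by several digits:
chunli@Linux:~/boost$ g++ main.cpp -lboost_regex -Wall && ./a.out "\w+\d+"
subexpressions: 0
Hello,world2016
world2016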
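The search step of the test program can be reduced to a single function. The sketch below assumes std::regex (ECMAScript grammar) rather than Boost.Regex so that it compiles on its own; first_match is a hypothetical name used only for illustration.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: returns the first substring of `text` matched by
// `pattern` (case-insensitively), or an empty string when nothing matches.
std::string first_match(const std::string& pattern, const std::string& text)
{
    const std::regex re(pattern, std::regex::icase);
    std::smatch m;
    return std::regex_search(text, m, re) ? m.str(0) : std::string();
}

// Small self-check that doubles as a usage example.
inline void self_check_first_match()
{
    // Letters followed by digits: the search skips "Hello," and finds "world2016".
    assert(first_match("\\w+\\d+", "Hello,world2016") == "world2016");
    // '^' and '$' anchor the match to the start and end of the input.
    assert(first_match("^\\d+$", "a2016").empty());
}
```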

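Marking: whatever is enclosed in a pair of parentheses () is marked. In Boost's Perl syntax the parentheses do not need to be escaped.
chunli@Linux:~/boost$ g++ main.cpp -l boost_regex -Wall && ./a.out "([[:alpha:]]+)[[:digit:]]+\1"
subexpressions: 1
hello123abc8888888abc
abc8888888abc abc
\1 refers back to the first marked sub-expression ($1).
Only marked content can be back-referenced.

?: The group is not marked, so it cannot be back-referenced:
chunli@Linux:~/boost$ g++ main.cpp -l boost_regex -Wall && ./a.out '(?:[[:alpha:]]+)[[:digit:]]+'
subexpressions: 0
abcd1234
11111@@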
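The difference between a marked group and a (?:...) group can be checked programmatically. This sketch assumes std::regex (ECMAScript grammar) in place of Boost.Regex; marked_count and groups_of are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: number of marked (capturing) groups in `pattern`.
unsigned marked_count(const std::string& pattern)
{
    return std::regex(pattern).mark_count();
}

// Hypothetical helper: whole match followed by every marked group of the
// first match in `text`; empty when there is no match.
std::vector<std::string> groups_of(const std::string& pattern, const std::string& text)
{
    const std::regex re(pattern);
    std::smatch m;
    std::vector<std::string> out;
    if (std::regex_search(text, m, re))
        for (std::size_t i = 0; i < m.size(); ++i)
            out.push_back(m.str(i));
    return out;
}

// Small self-check that doubles as a usage example.
inline void self_check_groups()
{
    assert(marked_count("([[:alpha:]]+)[[:digit:]]+\\1") == 1); // one marked group
    assert(marked_count("(?:[[:alpha:]]+)[[:digit:]]+") == 0);  // (?:...) is not marked
}
```

With the back-reference pattern, "hello123abc8888888abc" yields the whole match "abc8888888abc" and the marked group "abc", because \1 must repeat exactly what the group captured.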

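Repeats [greedy matching: match as much as possible]:
*      any number of times, including zero
+      at least once
?      zero or one time
{n}    exactly n times
{n,}   n or more times
{n,m}  between n and m times
chunli@Linux:~/boost$ g++ main.cpp -l boost_regex -Wall && ./a.out 'a.*b'
subexpressions: 0
azzzzzzzzzbbaaazzzzzzzb
azzzzzzzzzbbaaazzzzzzzb

Non-greedy matching [match as little as possible]: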

Non greedy repeats
The normal repeat operators are "greedy", that is to say they will consume as much input as possible. There are non-greedy versions available that will consume as little input as possible while still producing a match.
*?      Matches the previous atom zero or more times, while consuming as little input as possible.
+?      Matches the previous atom one or more times, while consuming as little input as possible.
??      Matches the previous atom zero or one times, while consuming as little input as possible.
{n,}?   Matches the previous atom n or more times, while consuming as little input as possible.
{n,m}?  Matches the previous atom between n and m times, while consuming as little input as possible.
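Greedy and non-greedy repeats can be compared side by side on the same input. This is a small sketch assuming std::regex (ECMAScript grammar), which supports the same *? syntax, rather than Boost.Regex; match_of is a hypothetical helper name.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: first match of `pattern` in `text`, or "" if none.
std::string match_of(const std::string& pattern, const std::string& text)
{
    const std::regex re(pattern);
    std::smatch m;
    return std::regex_search(text, m, re) ? m.str(0) : std::string();
}

// Small self-check that doubles as a usage example.
inline void self_check_greedy_lazy()
{
    const std::string text = "azzzzzzzzzbbaaazzzzzzzb";
    // Greedy: .* runs to the last 'b', so the whole line is matched.
    assert(match_of("a.*b", text) == text);
    // Non-greedy: .*? stops at the first 'b'.
    assert(match_of("a.*?b", text) == text.substr(0, text.find('b') + 1));
}
```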

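chunli@Linux:~/boost$ g++ main.cpp -l boost_regex -Wall && ./a.out 'a.*?b'
subexpressions: 0
azzzzzzzzzbbaaazzzzzzzb
azzzzzzzzzb

Stream mode (possessive repeats): the engine never looks back; once something is matched it stays matched. It exists for performance: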

Possessive repeats
By default when a repeated pattern does not match then the engine will backtrack until a match is found. However, this behaviour can sometimes be undesirable, so there are also "possessive" repeats: these match as much as possible and do not then allow backtracking if the rest of the expression fails to match.
*+      Matches the previous atom zero or more times, while giving nothing back.
++      Matches the previous atom one or more times, while giving nothing back.
?+      Matches the previous atom zero or one times, while giving nothing back.
{n,}+   Matches the previous atom n or more times, while giving nothing back.
{n,m}+  Matches the previous atom between n and m times, while giving nothing back.

Back references: a marked expression must exist for the reference to point at.
chunli@Linux:~/boost$ g++ main.cpp -lboost_regex -Wall && ./a.out '^(a*).*\1$'
subexpressions: 1
a66a66
asssasss
asssasss

Alternation (OR):

Alternation
The | operator will match either of its arguments, for example: abc|def will match either "abc" or "def".
Parenthesis can be used to group alternations, for example: ab(d|ef) will match either of "abd" or "abef".
Empty alternatives are not allowed (these are almost always a mistake), but if you really want an empty alternative use (?:) as a placeholder, for example:
|abc is not a valid expression, but
(?:)|abc is and is equivalent, also the expression:
(?:abc)?? has exactly the same effect.
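A short sketch of alternation, alone and combined with a word boundary. It assumes std::regex (ECMAScript grammar) instead of Boost.Regex; that grammar spells the boundary \b and does not accept Boost's \< and \>. find_first is a hypothetical helper name.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: first match of `pattern` in `text`, or "" if none.
std::string find_first(const std::string& pattern, const std::string& text)
{
    const std::regex re(pattern);
    std::smatch m;
    return std::regex_search(text, m, re) ? m.str(0) : std::string();
}

// Small self-check that doubles as a usage example.
inline void self_check_alternation()
{
    // (i|o) accepts either letter between 'l' and "ve".
    assert(find_first("l(i|o)ve", "love") == "love");
    assert(find_first("l(i|o)ve", "live") == "live");
    // Without a boundary the pattern also matches inside "glove" ...
    assert(find_first("l(i|o)ve", "glove") == "love");
    // ... with \b on both sides it must be a whole word.
    assert(find_first("\\bl(i|o)ve\\b", "glove").empty());
}
```

When several alternatives could match, the search reports the leftmost match in the input: for abc|123|234 on "123456789abc" that is "123".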

chunli@Linux:~/boost$ g++ main.cpp -lboost_regex -Wall && ./a.out 'l(i|o)ve'
subexpressions: 1
love
live
chunli@Linux:~/boost$ g++ main.cpp -lboost_regex -Wall && ./a.out '\<l(i|o)ve\>'
subexpressions: 1
love
live
chunli@Linux:~/boost$ g++ main.cpp -lboost_regex -Wall && ./a.out 'abc|123|234'
subexpressions: 0
123
abc
234
123456789abc
123

Word boundaries:

Word Boundaries
The following escape sequences match the boundaries of words:
\<  Matches the start of a word.
\>  Matches the end of a word.
\b  Matches a word boundary (the start or end of a word).
\B  Matches only when not at a word boundary.

Named expressions:

chunli@Linux:~/boost$ g++ main.cpp -lboost_regex -Wall && ./a.out '(?<r1>\d+)[[:blank:]]+\1'
subexpressions: 1
123
234
chunli@Linux:~/boost$ g++ main.cpp -lboost_regex -Wall && ./a.out '(?<r1>\d+)[[:blank:]]+\g{r1}'
subexpressions: 1
1234
1236
1236

Comments:

Comments
(?# ... ) is treated as a comment, its contents are ignored.
chunli@Linux:~/boost$ g++ main.cpp -lboost_regex -Wall && ./a.out '\d+(?#我的注釋)'
subexpressions: 0
hello1234
1234

Branch reset:

Branch reset
(?|pattern) resets the subexpression count at the start of each "|" alternative within pattern.
The sub-expression count following this construct is that of whichever branch had the largest number of sub-expressions. This construct is useful when you want to capture one of a number of alternative matches in a single sub-expression index.
In the following example the index of each sub-expression is shown below the expression:
# before  ---------------branch-reset----------- after
/ ( a )  (?| x ( y ) z | (p (q) r) | (t) u (v) ) ( z ) /x
# 1            2         2  3        2     3     4
chunli@Linux:~/boost$ ./a.out '/ ( a )  (?| x ( y ) z | (p (q) r) | (t) u (v) ) ( z ) /x'
subexpressions: 4

Lookahead: the characters are checked but not consumed, so they stay available for whatever comes next in the pattern.
Lookahead
(?=pattern) consumes zero characters, only if pattern matches.
(?!pattern) consumes zero characters, only if pattern does not match.
Lookahead is typically used to create the logical AND of two regular expressions, for example if a password must contain a lower case letter, an upper case letter, a punctuation symbol, and be at least 6 characters long, then the expression:
(?=.*[[:lower:]])(?=.*[[:upper:]])(?=.*[[:punct:]]).{6,}
could be used to validate the password.

Example 1: only th is matched, not ing, but ing must be present:
chunli@Linux:~/boost$ g++ main.cpp -lboost_regex -Wall && ./a.out 'th(?=ing)'
subexpressions: 0
those
thing
th

Example 2: the lookahead checks ing without consuming it, then (in) matches and consumes in:
chunli@Linux:~/boost$ g++ main.cpp -lboost_regex -Wall && ./a.out 'th(?=ing)(in)'
subexpressions: 1
thing
thin in
those

Example 3: th matches everywhere except where it is followed by ing:
chunli@Linux:~/boost$ g++ main.cpp -lboost_regex -Wall && ./a.out 'th(?!ing)'
subexpressions: 0
this
th
thing
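The password rule quoted above and the th/ing examples can be exercised directly. The sketch assumes std::regex (ECMAScript grammar), which supports (?=...) and (?!...), rather than Boost.Regex; valid_password and lookahead_match are hypothetical helper names.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: applies the password expression from the text above.
// Each lookahead checks one requirement without consuming input; .{6,}
// then enforces the minimum length over the whole string.
bool valid_password(const std::string& candidate)
{
    static const std::regex rule(
        "(?=.*[[:lower:]])(?=.*[[:upper:]])(?=.*[[:punct:]]).{6,}");
    return std::regex_match(candidate, rule);
}

// Hypothetical helper: first match of `pattern` in `text`, or "" if none.
std::string lookahead_match(const std::string& pattern, const std::string& text)
{
    const std::regex re(pattern);
    std::smatch m;
    return std::regex_search(text, m, re) ? m.str(0) : std::string();
}

// Small self-check that doubles as a usage example.
inline void self_check_lookahead()
{
    assert(valid_password("Abc!def"));
    assert(!valid_password("abcdef!"));                        // no upper-case letter
    assert(lookahead_match("th(?=ing)", "thing") == "th");     // ing required, not consumed
    assert(lookahead_match("th(?!ing)", "thing").empty());     // rejected because ing follows
}
```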

Lookbehind:
Lookbehind
(?<=pattern) consumes zero characters, only if pattern could be matched against the characters preceding the current position (pattern must be of fixed length).
(?<!pattern) consumes zero characters, only if pattern could not be matched against the characters preceding the current position (pattern must be of fixed length).
chunli@Linux:~/boost$ g++ main.cpp -lboost_regex -Wall && ./a.out '(?<=ti)mer'
subexpressions: 0
timer
mer
memer
chunli@Linux:~/boost$ g++ main.cpp -lboost_regex -Wall && ./a.out '(?<!ti)mer'
subexpressions: 0
timer
hhmer
mer

Recursive expressions:
(?N) (?-N) (?+N) (?R) (?0) (?&NAME)
(?R) and (?0) recurse to the start of the entire pattern.
(?N) executes sub-expression N recursively, for example (?2) will recurse to sub-expression 2.
(?-N) and (?+N) are relative recursions, so for example (?-1) recurses to the last sub-expression to be declared, and (?+1) recurses to the next sub-expression to be declared.
(?&NAME) recurses to named sub-expression NAME.

Operator precedence:

Operator precedence
The order of precedence for operators is as follows:
1. Collation-related bracket symbols   [==] [::] [..]
2. Escaped characters                  \
3. Character set (bracket expression)  []
4. Grouping                            ()
5. Single-character-ERE duplication    * + ? {m,n}
6. Concatenation
7. Anchoring                           ^$
8. Alternation                         |

===========================================================
Boost regex API

Showing the number of sub-expressions:

pi@raspberrypi:~/boost $ cat main.cpp
#include <iostream>
#include <string>
#include <boost/regex.hpp>
using namespace std;

int main(int argc, const char* argv[])
{
    using boost::regex;
    regex e1;
    e1 = "^[[:xdigit:]]*$";
    cout << e1.str() << endl;
    cout << e1.mark_count() << endl;

    // If regex::save_subexpression_location is not set,
    // e2.subexpression(0) reports an error.
    regex e2("\\b\\w+(?=ing)\\b.{2,}?([[:alpha:]]*)$",
             regex::perl | regex::icase | regex::save_subexpression_location);
    cout << e2.str() << endl;
    cout << e2.mark_count() << endl;

    // Locate the first marked sub-expression inside the pattern text.
    pair<regex::const_iterator, regex::const_iterator> sub1 = e2.subexpression(0);
    string sub1Str(sub1.first, ++sub1.second); // ++ includes the closing parenthesis
    cout << sub1Str << endl;
    return 0;
}
pi@raspberrypi:~/boost $ g++ main.cpp -lboost_regex -Wall && ./a.out
^[[:xdigit:]]*$
0
\b\w+(?=ing)\b.{2,}?([[:alpha:]]*)$
1
([[:alpha:]]*)

Boost regex: sub_match

pi@raspberrypi:~/boost $ cat main.cpp
#include <iostream>
#include <string>
#include <boost/regex.hpp>
using namespace std;

int main(int argc, const char* argv[])
{
    using boost::regex;
    // A word starting with T, a \b boundary, a space, then hexadecimal digits.
    // The backslashes are doubled so that the regex engine gets to see them.
    regex e1("\\bT\\w+\\b ([[:xdigit:]]+)");
    string s("Time ef09,Todo 001");
    boost::smatch m;
    // With match_all the match has to reach the end of the input,
    // so only the last item would be found:
    // bool b = boost::regex_search(s, m, e1, boost::match_all);
    bool b = boost::regex_search(s, m, e1); // by default only the first match is found
    cout << b << endl;
    const int n = m.size();
    for (int i = 0; i < n; i++)
    {
        cout << "matched:" << i << " ,position:" << m.position(i) << ", ";
        cout << "length:" << m.length(i) << " str:" << m.str(i) << endl;
    }
    return 0;
}
pi@raspberrypi:~/boost $ g++ main.cpp -lboost_regex -Wall && ./a.out
1
matched:0 ,position:0, length:9 str:Time ef09
matched:1 ,position:5, length:4 str:ef09
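The position, length and text of each sub-match can also be collected into a structure instead of being printed. This sketch assumes std::regex, whose std::smatch offers the same position(), length() and str() members, rather than Boost.Regex; Part and describe_match are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical record for one sub-match.
struct Part
{
    std::ptrdiff_t position; // offset from the start of the searched text
    std::ptrdiff_t length;   // number of characters matched
    std::string text;        // the matched characters
};

// Hypothetical helper: describes the whole match (index 0) and every
// marked sub-expression of the first match of `pattern` in `text`.
std::vector<Part> describe_match(const std::string& pattern, const std::string& text)
{
    const std::regex re(pattern);
    std::smatch m;
    std::vector<Part> parts;
    if (std::regex_search(text, m, re))
        for (std::size_t i = 0; i < m.size(); ++i)
            parts.push_back(Part{m.position(i), m.length(i), m.str(i)});
    return parts;
}

// Small self-check that doubles as a usage example.
inline void self_check_describe_match()
{
    const std::vector<Part> p =
        describe_match("\\bT\\w+\\b ([[:xdigit:]]+)", "Time ef09,Todo 001");
    assert(p.size() == 2);
    assert(p[0].position == 0 && p[0].length == 9 && p[0].text == "Time ef09");
    assert(p[1].position == 5 && p[1].length == 4 && p[1].text == "ef09");
}
```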

Boost regex: the regex_replace algorithm

pi@raspberrypi:~/boost $ cat main.cpp
#include <iostream>
#include <iterator>
#include <string>
#include <boost/regex.hpp>
using namespace std;

int main(int argc, const char* argv[])
{
    using boost::regex;
    regex e1("([TQV])|(\\*)|(@)");
    // Lower-case T/Q/V, turn * into +, turn @ into #.
    string replaceFmt("(\\L?1$&)(?2+)(?3#)");
    string src("guTdQhV@@g*b*"); // the input string
    cout << "before replaced: " << src << endl;
    // before replaced: guTdQhV@@g*b*

    // format_all is required: it enables the conditional (?N...) syntax.
    string newStr1 = regex_replace(src, e1, replaceFmt,
                                   boost::match_default | boost::format_all);
    cout << "after replaced: " << newStr1 << endl;
    // after replaced: gutdqhv##g+b+

    // Odd-looking result: without format_all the (?N...) parts are copied out literally.
    string newStr2 = regex_replace(src, e1, replaceFmt,
                                   boost::match_default | boost::format_default);
    cout << "after replaced: " << newStr2 << endl;

    // Another way: write the result through an output iterator.
    ostream_iterator<char> oi(cout);
    regex_replace(oi, src.begin(), src.end(), e1, replaceFmt,
                  boost::match_default | boost::match_all);
    cout << endl;
    return 0;
}
pi@raspberrypi:~/boost $ g++ main.cpp -lboost_regex -Wall && ./a.out
before replaced: guTdQhV@@g*b*
after replaced: gutdqhv##g+b+
after replaced: gu(?1t)(?2+)(?3#)d(?1q)(?2+)(?3#)h(?1v)(?2+)(?3#)(?1@)(?2+)(?3#)(?1@)(?2+)(?3#)g(?1*)(?2+)(?3#)b(?1*)(?2+)(?3#)
guTdQhV@@g*b(?1*)(?2+)(?3#)
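The conditional format string (?1...)(?2...)(?3...) is a Boost extension. As a rough equivalent, the sketch below assumes std::regex, which has no conditional format, and picks the replacement by checking which group took part in each match; this is a swapped-in technique, and replace_by_group is a hypothetical helper name.

```cpp
#include <cassert>
#include <cctype>
#include <regex>
#include <string>

// Hypothetical helper: lower-cases T/Q/V, turns '*' into '+' and '@' into '#',
// deciding per match which of the three marked groups matched.
std::string replace_by_group(const std::string& src)
{
    static const std::regex re("([TQV])|(\\*)|(@)");
    std::string out;
    std::string::const_iterator last = src.begin();
    for (std::sregex_iterator it(src.begin(), src.end(), re), end; it != end; ++it)
    {
        const std::smatch& m = *it;
        out.append(last, m[0].first);          // copy the text between matches
        if (m[1].matched)                      // group 1: one of T, Q, V
            out += static_cast<char>(std::tolower(static_cast<unsigned char>(*m[1].first)));
        else if (m[2].matched)                 // group 2: '*'
            out += '+';
        else                                   // group 3: '@'
            out += '#';
        last = m[0].second;
    }
    out.append(last, src.end());               // copy the tail after the last match
    return out;
}

// Small self-check that doubles as a usage example.
inline void self_check_replace_by_group()
{
    assert(replace_by_group("guTdQhV@@g*b*") == "gutdqhv##g+b+");
}
```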

Boost regex: iterators

pi@raspberrypi:~/boost $ cat main.cpp
#include <iostream>
#include <string>
#include <boost/regex.hpp>
using namespace std;

int main(int argc, const char* argv[])
{
    using boost::regex;
    regex e("(a+).+?", regex::icase);
    string s("ann abb aaat");
    boost::sregex_iterator it1(s.begin(), s.end(), e);
    boost::sregex_iterator it2; // a default-constructed iterator marks the end
    for (; it1 != it2; ++it1)
    {
        boost::smatch m = *it1;
        cout << m << endl;
    }
    return 0;
}
pi@raspberrypi:~/boost $ g++ main.cpp -lboost_regex -Wall && ./a.out
an
ab
aaat
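The same loop can return its matches instead of printing them. This sketch assumes std::sregex_iterator from the standard library in place of boost::sregex_iterator; all_matches is a hypothetical helper name.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: every non-overlapping match of `pattern` in `text`,
// found case-insensitively, in order of appearance.
std::vector<std::string> all_matches(const std::string& pattern, const std::string& text)
{
    const std::regex re(pattern, std::regex::icase);
    std::vector<std::string> out;
    // A default-constructed sregex_iterator acts as the end marker.
    for (std::sregex_iterator it(text.begin(), text.end(), re), end; it != end; ++it)
        out.push_back(it->str());
    return out;
}

// Small self-check that doubles as a usage example.
inline void self_check_all_matches()
{
    const std::vector<std::string> m = all_matches("(a+).+?", "ann abb aaat");
    assert(m.size() == 3);
    assert(m[0] == "an" && m[1] == "ab" && m[2] == "aaat");
}
```

Each match is as short as the pattern allows: a+ takes the run of a's greedily, then the non-greedy .+? takes exactly one more character.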

Boost regex: -1 selects the text that was not matched

pi@raspberrypi:~/boost $ cat main.cpp
#include <iostream>
#include <string>
#include <boost/regex.hpp>
using namespace std;

int main(int argc, const char* argv[])
{
    using boost::regex;
    string s("this ::a string ::of tokens");
    boost::regex re("\\s+:*"); // matches the separators: whitespace plus optional colons
    // Index -1 asks for the pieces between the matches rather than the matches themselves.
    boost::sregex_token_iterator i(s.begin(), s.end(), re, -1);
    boost::sregex_token_iterator j;
    unsigned count = 0;
    while (i != j)
    {
        cout << *i++ << endl;
        count++;
    }
    cout << "There were " << count << " tokens found" << endl;
    return 0;
}
pi@raspberrypi:~/boost $ g++ main.cpp -lboost_regex -Wall && ./a.out
this
a
string
of
tokens
There were 5 tokens found
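Splitting with sub-match index -1 fits naturally into a small function. The sketch assumes std::sregex_token_iterator, which follows the same -1 convention, rather than the Boost class; split_on is a hypothetical helper name.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: splits `text` on every match of `separator`.
// Index -1 selects the stretches of text that the regex did NOT match.
std::vector<std::string> split_on(const std::string& text, const std::string& separator)
{
    const std::regex re(separator);
    std::vector<std::string> tokens;
    for (std::sregex_token_iterator it(text.begin(), text.end(), re, -1), end; it != end; ++it)
        tokens.push_back(it->str());
    return tokens;
}

// Small self-check that doubles as a usage example.
inline void self_check_split_on()
{
    const std::vector<std::string> t = split_on("this ::a string ::of tokens", "\\s+:*");
    assert(t.size() == 5);
    assert(t[0] == "this" && t[1] == "a" && t[4] == "tokens");
}
```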

Boost regex: captures. Why does the official code crash?

pi@raspberrypi:~/boost $ cat main.cpp
#include <boost/regex.hpp>
#include <iostream>

void print_captures(const std::string& regx, const std::string& text)
{
    boost::regex e(regx);
    boost::smatch what;
    std::cout << "Expression:  \"" << regx << "\"\n";
    std::cout << "Text:        \"" << text << "\"\n";
    if (boost::regex_match(text, what, e, boost::match_extra))
    {
        unsigned i, j;
        std::cout << "** Match found **\n   Sub-Expressions:\n";
        for (i = 0; i < what.size(); ++i)
            std::cout << "      $" << i << " = \"" << what[i] << "\"\n";
        std::cout << "   Captures:\n";
        for (i = 0; i < what.size(); ++i)
        {
            std::cout << "      $" << i << " = {";
            for (j = 0; j < what.captures(i).size(); ++j)
            {
                if (j)
                    std::cout << ", ";
                else
                    std::cout << " ";
                std::cout << "\"" << what.captures(i)[j] << "\"";
            }
            std::cout << " }\n";
        }
    }
    else
    {
        std::cout << "** No Match found **\n";
    }
}

int main(int, char*[])
{
    print_captures("(([[:lower:]]+)|([[:upper:]]+))+", "aBBcccDDDDDeeeeeeee");
    print_captures("a(b+|((c)*))+d", "abd");
    print_captures("(.*)bar|(.*)bah", "abcbar");
    print_captures("(.*)bar|(.*)bah", "abcbah");
    print_captures("^(?:(\\w+)|(?>\\W+))*$",
                   "now is the time for all good men to come to the aid of the party");
    print_captures("^(?>(\\w+)\\W*)*$",
                   "now is the time for all good men to come to the aid of the party");
    print_captures("^(\\w+)\\W+(?>(\\w+)\\W+)*(\\w+)$",
                   "now is the time for all good men to come to the aid of the party");
    print_captures("^(\\w+)\\W+(?>(\\w+)\\W+(?:(\\w+)\\W+){0,2})*(\\w+)$",
                   "now is the time for all good men to come to the aid of the party");
    return 0;
}
pi@raspberrypi:~/boost $ g++ -D BOOST_REGEX_MATCH_EXTRA -l boost_regex -Wall main.cpp && ./a.out
Expression:  "(([[:lower:]]+)|([[:upper:]]+))+"
Text:        "aBBcccDDDDDeeeeeeee"
Match found
Bus error

A likely explanation, not verified in these notes: the Boost documentation states that captures() requires the library itself to be built with BOOST_REGEX_MATCH_EXTRA defined. Defining the macro only while compiling main.cpp and then linking against a stock libboost_regex leaves the program and the library disagreeing about the layout of the match objects, which can end in a crash like the one above.

Boost regex: official example

pi@raspberrypi:~/boost $ cat main.cpp
#include <iostream>
#include <stdlib.h>
#include <string>
#include <boost/regex.hpp>
using namespace std;
using namespace boost;

// Three parts: the digits, the separator ('-', a space or end of input), and the rest.
regex expression("^([0-9]+)(\\-| |$)(.*)$");

// On success returns the FTP response code and stores the message text in *msg.
int process_ftp(const char* response, std::string* msg)
{
    cmatch what;
    if (regex_match(response, what, expression))
    {
        // what[0] contains the whole string
        // what[1] contains the response code
        // what[2] contains the separator character
        // what[3] contains the text message.
        if (msg)
            msg->assign(what[3].first, what[3].second);
        return ::atoi(what[1].first);
    }
    // failure did not match
    if (msg)
        msg->erase();
    return -1;
}

#if defined(BOOST_MSVC) || (defined(__BORLANDC__) && (__BORLANDC__ == 0x550))
istream& getline(istream& is, std::string& s)
{
    s.erase();
    char c = static_cast<char>(is.get());
    while (c != '\n')
    {
        s.append(1, c);
        c = static_cast<char>(is.get());
    }
    return is;
}
#endif

int main(int argc, const char*[])
{
    std::string in, out;
    do
    {
        if (argc == 1)
        {
            cout << "enter test string" << endl;
            getline(cin, in);
            if (in == "quit")
                break;
        }
        else
            in = "100 this is an ftp message text";
        int result;
        result = process_ftp(in.c_str(), &out);
        if (result != -1)
        {
            cout << "Match found:" << endl;
            cout << "Response code: " << result << endl;
            cout << "Message text: " << out << endl;
        }
        else
        {
            cout << "Match not found" << endl;
        }
        cout << endl;
    } while (argc == 1);
    return 0;
}
pi@raspberrypi:~/boost $ g++ -l boost_regex -Wall main.cpp && ./a.out
enter test string
404 not found
Match found:
Response code: 404
Message text: not found

enter test string
500 service error
Match found:
Response code: 500
Message text: service error

enter test string

Boost regex: a search-based simple lexer that analyses C++ class definitions

pi@raspberrypi:~/boost $ cat main.cpp
#include <string>
#include <map>
#include <boost/regex.hpp>

// purpose:
// takes the contents of a file in the form of a string
// and searches for all the C++ class definitions, storing
// their locations in a map of strings/int's
typedef std::map<std::string, std::string::difference_type, std::less<std::string> > map_type;

const char* re =
    // possibly leading whitespace:
    "^[[:space:]]*"
    // possible template declaration:
    "(template[[:space:]]*<[^;:{]+>[[:space:]]*)?"
    // class or struct:
    "(class|struct)[[:space:]]*"
    // leading declspec macros etc:
    "("
        "\\<\\w+\\>"
        "("
            "[[:blank:]]*\\([^)]*\\)"
        ")?"
        "[[:space:]]*"
    ")*"
    // the class name
    "(\\<\\w*\\>)[[:space:]]*"
    // template specialisation parameters
    "(<[^;:{]+>)?[[:space:]]*"
    // terminate in { or :
    "(\\{|:[^;\\{()]*\\{)";

boost::regex expression(re);

void IndexClasses(map_type& m, const std::string& file)
{
    std::string::const_iterator start, end;
    start = file.begin();
    end = file.end();
    boost::match_results<std::string::const_iterator> what;
    boost::match_flag_type flags = boost::match_default;
    while (boost::regex_search(start, end, what, expression, flags))
    {
        // what[0] contains the whole string
        // what[5] contains the class name.
        // what[6] contains the template specialisation if any.
        // add class name and position to map:
        m[std::string(what[5].first, what[5].second) + std::string(what[6].first, what[6].second)] =
            what[5].first - file.begin();
        // update search position:
        start = what[0].second;
        // update flags:
        flags |= boost::match_prev_avail;
        flags |= boost::match_not_bob;
    }
}

#include <fstream>
#include <iostream>
using namespace std;

void load_file(std::string& s, std::istream& is)
{
    s.erase();
    if (is.bad()) return;
    s.reserve(static_cast<std::string::size_type>(is.rdbuf()->in_avail()));
    char c;
    while (is.get(c))
    {
        if (s.capacity() == s.size())
            s.reserve(s.capacity() * 3);
        s.append(1, c);
    }
}

int main(int argc, const char** argv)
{
    std::string text;
    for (int i = 1; i < argc; ++i)
    {
        cout << "Processing file " << argv[i] << endl;
        map_type m;
        std::ifstream fs(argv[i]);
        load_file(text, fs);
        fs.close();
        IndexClasses(m, text);
        cout << m.size() << " matches found" << endl;
        map_type::iterator c, d;
        c = m.begin();
        d = m.end();
        while (c != d)
        {
            cout << "class \"" << (*c).first << "\" found at index: " << (*c).second << endl;
            ++c;
        }
    }
    return 0;
}

pi@raspberrypi:~/boost $ cat my_class.cpp
template <class T>
struct A
{
public:
};
template <class T>
class M
{
}
;
pi@raspberrypi:~/boost $ g++ -l boost_regex -Wall main.cpp && ./a.out my_class.cpp
Processing file my_class.cpp
2 matches found
class "A" found at index:
class "M" found at index:

Boost regex: an iterator-based simple lexer that analyses C++ class definitions

pi@raspberrypi:~/boost $ cat main.cpp
#include <algorithm>
#include <string>
#include <map>
#include <fstream>
#include <iostream>
#include <boost/regex.hpp>
using namespace std;

// purpose:
// takes the contents of a file in the form of a string
// and searches for all the C++ class definitions, storing
// their locations in a map of strings/int's
typedef std::map<std::string, std::string::difference_type, std::less<std::string> > map_type;

const char* re =
    // possibly leading whitespace:
    "^[[:space:]]*"
    // possible template declaration:
    "(template[[:space:]]*<[^;:{]+>[[:space:]]*)?"
    // class or struct:
    "(class|struct)[[:space:]]*"
    // leading declspec macros etc:
    "("
        "\\<\\w+\\>"
        "("
            "[[:blank:]]*\\([^)]*\\)"
        ")?"
        "[[:space:]]*"
    ")*"
    // the class name
    "(\\<\\w*\\>)[[:space:]]*"
    // template specialisation parameters
    "(<[^;:{]+>)?[[:space:]]*"
    // terminate in { or :
    "(\\{|:[^;\\{()]*\\{)";

boost::regex expression(re);
map_type class_index;

bool regex_callback(const boost::match_results<std::string::const_iterator>& what)
{
    // what[0] contains the whole string
    // what[5] contains the class name.
    // what[6] contains the template specialisation if any.
    // add class name and position to map:
    class_index[what[5].str() + what[6].str()] = what.position(5);
    return true;
}

void load_file(std::string& s, std::istream& is)
{
    s.erase();
    if (is.bad()) return;
    s.reserve(static_cast<std::string::size_type>(is.rdbuf()->in_avail()));
    char c;
    while (is.get(c))
    {
        if (s.capacity() == s.size())
            s.reserve(s.capacity() * 3);
        s.append(1, c);
    }
}

int main(int argc, const char** argv)
{
    std::string text;
    for (int i = 1; i < argc; ++i)
    {
        cout << "Processing file " << argv[i] << endl;
        std::ifstream fs(argv[i]);
        load_file(text, fs);
        fs.close();
        // construct our iterators:
        boost::sregex_iterator m1(text.begin(), text.end(), expression);
        boost::sregex_iterator m2;
        std::for_each(m1, m2, &regex_callback);
        // copy results:
        cout << class_index.size() << " matches found" << endl;
        map_type::iterator c, d;
        c = class_index.begin();
        d = class_index.end();
        while (c != d)
        {
            cout << "class \"" << (*c).first << "\" found at index: " << (*c).second << endl;
            ++c;
        }
        class_index.erase(class_index.begin(), class_index.end());
    }
    return 0;
}
pi@raspberrypi:~/boost $ g++ -l boost_regex -Wall main.cpp && ./a.out main.cpp my_class.cpp
Processing file main.cpp
0 matches found
Processing file my_class.cpp
2 matches found
class "A" found at index:
class "B" found at index:

Boost regex: converting a C++ file to an HTML file

pi@raspberrypi:~/boost $ cat main.cpp
#include <fstream>
#include <iostream>
#include <iterator>
#include <sstream>
#include <string>
#include <boost/regex.hpp>

// purpose:
// takes the contents of a file and transform to
// syntax highlighted code in html format
boost::regex e1, e2;
extern const char* expression_text;
extern const char* format_string;
extern const char* pre_expression;
extern const char* pre_format;
extern const char* header_text;
extern const char* footer_text;

void load_file(std::string& s, std::istream& is)
{
    s.erase();
    if (is.bad()) return;
    s.reserve(static_cast<std::string::size_type>(is.rdbuf()->in_avail()));
    char c;
    while (is.get(c))
    {
        if (s.capacity() == s.size())
            s.reserve(s.capacity() * 3);
        s.append(1, c);
    }
}

int main(int argc, const char** argv)
{
    try {
        e1.assign(expression_text);
        e2.assign(pre_expression);
        for (int i = 1; i < argc; ++i)
        {
            std::cout << "Processing file " << argv[i] << std::endl;
            std::ifstream fs(argv[i]);
            std::string in;
            load_file(in, fs);
            fs.close();
            std::string out_name = std::string(argv[i]) + std::string(".htm");
            std::ofstream os(out_name.c_str());
            os << header_text;
            // strip '<' and '>' first by outputting to a
            // temporary string stream
            std::ostringstream t(std::ios::out | std::ios::binary);
            std::ostream_iterator<char> oi(t);
            boost::regex_replace(oi, in.begin(), in.end(), e2, pre_format,
                                 boost::match_default | boost::format_all);
            // then output to final output stream
            // adding syntax highlighting:
            std::string s(t.str());
            std::ostream_iterator<char> out(os);
            boost::regex_replace(out, s.begin(), s.end(), e1, format_string,
                                 boost::match_default | boost::format_all);
            os << footer_text;
            os.close();
        }
    }
    catch (...)
    {
        return -1;
    }
    return 0;
}

const char* pre_expression = "(<)|(>)|(&)|\\r";
const char* pre_format = "(?1&lt;)(?2&gt;)(?3&amp;)";

const char* expression_text =
    // preprocessor directives: index 1
    "(^[[:blank:]]*#(?:[^\\\\\\n]|\\\\[^\\n[:punct:][:word:]]*[\\n[:punct:][:word:]])*)|"
    // comment: index 2
    "(//[^\\n]*|/\\*.*?\\*/)|"
    // literals: index 3
    "\\<([+-]?(?:(?:0x[[:xdigit:]]+)|(?:(?:[[:digit:]]*\\.)?[[:digit:]]+(?:[eE][+-]?[[:digit:]]+)?))u?(?:(?:int(?:8|16|32|64))|L)?)\\>|"
    // string literals: index 4
    "('(?:[^\\\\']|\\\\.)*'|\"(?:[^\\\\\"]|\\\\.)*\")|"
    // keywords: index 5 (the keyword alternation is missing from this excerpt)
    ;

const char* format_string = "(?1<font color=\"#008040\">$&</font>)"
