Some definitions for regular expressions
- Literal, metacharacter, target string, escape sequence, and search expression
Literal - A literal is any character we use in a search or matching expression. For example, to find ind in windows, the ind is a literal string: each character plays a part in the search; it is literally the string we want to find.
Metacharacter - A metacharacter is one or more special characters that have a unique meaning and are NOT used as literals in the search expression. For example, the character ^ (circumflex or caret) is a metacharacter.
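[ ] Match any one of the characters inside the square brackets
- Range separator inside square brackets, for example [0-9]
^ Start of a string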
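The difference between a literal and a metacharacter can be sketched in a few lines. This is only an illustration using Python's re module, which is an assumption on my part (MarcEdit itself uses the .NET engine); the strings "windows" and "2^8" are just sample targets.

```python
import re

target = "windows"

# "ind" is a literal: each character must appear in the target as written.
literal_match = re.search(r"ind", target)

# "^" is a metacharacter: it anchors the match to the start of the string,
# so "^ind" fails on "windows" while "^win" succeeds.
anchored_miss = re.search(r"^ind", target)
anchored_hit = re.search(r"^win", target)

# Escaping with a backslash turns the caret back into a literal character.
caret_literal = re.search(r"\^", "2^8")

print(literal_match.span())    # (1, 4)
print(anchored_miss)           # None
print(anchored_hit.group())    # win
print(caret_literal.group())   # ^
```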
1. A simple find and replace to change the indicator and first subfield code.
Find: =505 00$t
Replace: =505 0\$a
Result:
=505 0\$aYour love$g(4:17) --$tThe driver /$r(featuring Dierks Bentley and Eric Pasley)$g(4:34) --$tDancing around it$g(4:38) --$tSouthern accents /$r(featuring Stevie Nicks)$g(4:15) --$tLonely girl$g(2:59) --$tThe only one who gets me$g(3:46) --$tRound in circles$g(4:17) --$tI wish you were here /$r(featuring Miranda Lambert)$g(3:48) --$tLeaving Nashville$g(3:29).
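For readers who want to try step 1 outside MarcEdit, here is a rough Python equivalent. It is a sketch only: the record is a shortened copy of the field above, and the variable names are made up for the illustration.

```python
# Shortened copy of the 505 field from the example (only two titles kept).
record = "=505 00$tYour love$g(4:17) --$tLeaving Nashville$g(3:29)."

# Plain string replacement: no regular expression is involved, so "$" and
# "\" are ordinary characters here. The count of 1 limits the change to the
# first occurrence, i.e. the tag, indicators, and first subfield code.
step1 = record.replace("=505 00$t", "=505 0\\$a", 1)

print(step1)
# =505 0\$aYour love$g(4:17) --$tLeaving Nashville$g(3:29).
```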
2. Use the Edit Field Data tool to delete subfield coding.
Field: 505
Find: \$[grt]
Replace: [space]
Use regular expressions.
Breakdown explanation:
\ Escape sequence: a way of indicating that we want to use one of our metacharacters as a literal. In a regular expression, an escape sequence involves placing the metacharacter \ (backslash) in front of the metacharacter that we want to use as a literal. Here, \$ matches a literal dollar sign.
[ ] Match anything inside the square brackets for ONE character position, once and only once. For example, [12] means match the target to 1 and, if that does not match, then match the target to 2, while [0123456789] means match any character in the range 0 to 9.
$ End of a string (when it is not escaped)
The regex looks for a delimiter (the dollar sign) followed by one of the three subfield codes "g", "r", and "t", and replaces this with a space. (Your examples show no spaces around the subfield coding, but if there were some, you'd end up with double spaces to clean up as the last step.) Result:
=505 0\$aYour love (4:17) -- The driver / (featuring Dierks Bentley and Eric Pasley) (4:34) -- Dancing around it (4:38) -- Southern accents / (featuring Stevie Nicks) (4:15) -- Lonely girl (2:59) -- The only one who gets me (3:46) -- Round in circles (4:17) -- I wish you were here / (featuring Miranda Lambert) (3:48) -- Leaving Nashville (3:29).
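The same search can be tried in Python as a quick check. This is a sketch that assumes Python's re module treats \$ and [grt] the way MarcEdit's engine does; the input is a shortened version of the step 1 result.

```python
import re

# Shortened result of step 1 (raw string, so the backslash is kept as typed).
step1 = r"=505 0\$aYour love$g(4:17) --$tThe driver /$r(featuring Dierks Bentley and Eric Pasley)$g(4:34)."

# \$ is a literal dollar sign; [grt] is exactly one of g, r, or t.
# Each delimiter plus code pair is replaced by a single space.
step2 = re.sub(r"\$[grt]", " ", step1)

# Without the backslash, $ means "end of string", so nothing can follow it
# and the pattern never matches.
unescaped = re.findall(r"$[grt]", step1)

print(step2)
# =505 0\$aYour love (4:17) -- The driver / (featuring Dierks Bentley and Eric Pasley) (4:34).
print(unescaped)
# []
```

Note that $a is left alone because a is not in the bracket list.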
3. Use the Edit Field Data tool to delete durations.
Field: 505
Find: [space]\(\d*?:\d+?\)
Replace: [nothing]
Use regular expressions.
Breakdown explanation:
( ) An open parenthesis and a close parenthesis may be used to group (or bind) parts of our search expression together. In this search they are escaped as \( and \), so they match literal parentheses instead.
\d Match any character in the range 0-9 (equivalent of POSIX [:digit:])
* 0 or more of previous expression
? 0 or 1 of previous expression; also forces minimal matching when an expression might match several strings within a search string.
+ 1 or more of previous expression
Of course, you don't type "[space]"; just press the space bar to put in a space before that first backslash. Likewise, don't type "[nothing]"; just leave the box completely empty. The regex looks for zero or more digits between a left parenthesis and a colon, followed by at least one digit and then a right parenthesis. (If the records describe really long works with times that use two colons, this probably won't work. And if times are sometimes given with words, e.g., "4 min., 17 sec.", you'd have to try another approach.) Result:
=505 0\$aYour love -- The driver / (featuring Dierks Bentley and Eric Pasley) -- Dancing around it -- Southern accents / (featuring Stevie Nicks) -- Lonely girl -- The only one who gets me -- Round in circles -- I wish you were here / (featuring Miranda Lambert) -- Leaving Nashville.
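A small Python sketch of step 3 follows, again assuming Python's re module as a stand-in for MarcEdit's engine. The name DURATION and the "Symphony (1:02:03)" string are invented for the illustration; the latter shows the two-colon limitation mentioned above.

```python
import re

# A space, a literal "(", zero or more digits, a colon, one or more digits,
# and a literal ")".
DURATION = r" \(\d*?:\d+?\)"

# Shortened result of step 2.
step2 = r"=505 0\$aYour love (4:17) -- The driver / (featuring Dierks Bentley and Eric Pasley) (4:34)."
step3 = re.sub(DURATION, "", step2)

# A time with two colons is not matched, so it would be left in place.
long_time = re.search(DURATION, "Symphony (1:02:03)")

# "?" after a quantifier asks for the shortest possible match.
lazy = re.search(r"\d+?", "417").group()
greedy = re.search(r"\d+", "417").group()

print(step3)
# =505 0\$aYour love -- The driver / (featuring Dierks Bentley and Eric Pasley).
print(long_time)      # None
print(lazy, greedy)   # 4 417
```

The "(featuring ...)" parentheses survive because no colon follows the opening parenthesis directly or after digits only.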
4. Use the Edit Field Data tool to delete the remaining parentheses.
Field: 505
Find: [\(\)]
Replace: [nothing]
Use regular expressions.
Breakdown explanation:
[ ] Match anything inside the square brackets for ONE character position, once and only once. For example, [12] means match the target to 1 and if that does not match then match the target to 2.
\ Preceding a metacharacter, it makes it a literal instead of a special character.
( ) Logical grouping of part of an expression; escaped here as \( and \), so they are treated as literal parentheses.
The regex looks for either a left or right parenthesis and simply deletes it. Result:
=505 0\$aYour love -- The driver / featuring Dierks Bentley and Eric Pasley -- Dancing around it -- Southern accents / featuring Stevie Nicks -- Lonely girl -- The only one who gets me -- Round in circles -- I wish you were here / featuring Miranda Lambert -- Leaving Nashville.
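Step 4 can be checked the same way. This sketch assumes Python's re module; in that engine the parentheses inside square brackets are already literal, so the unescaped form [()] should behave the same as [\(\)].

```python
import re

# Shortened result of step 3.
step3 = r"=505 0\$aYour love -- The driver / (featuring Dierks Bentley and Eric Pasley)."

# One character position that is either "(" or ")", replaced with nothing.
step4 = re.sub(r"[\(\)]", "", step3)

# Same search without the backslashes inside the brackets.
unescaped_form = re.sub(r"[()]", "", step3)

print(step4)
# =505 0\$aYour love -- The driver / featuring Dierks Bentley and Eric Pasley.
print(step4 == unescaped_form)
# True
```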
MarcEdit
An introduction to the MARC record editing software MarcEdit
http://guides.library.illinois.edu/c.php?g=463460&p=3168242
RegExLib.com Regular Expression Cheat Sheet (.NET)
http://regexlib.com/CheatSheet.aspx?AspxAutoDetectCookieSupport=1