Regex - Converting formatted content notes (505) to unformatted

This question was asked on the MarcEdit Listserv and answered by MarcEdit experts and librarians.

Question: Converting formatted content notes (505) to unformatted.

=505 00$tYour love$g(4:17) --$tThe driver /$r(featuring Dierks Bentley and Eric Pasley)$g(4:34) --$tDancing around it$g(4:38) --$tSouthern accents /$r(featuring Stevie Nicks)$g(4:15) --$tLonely girl$g(2:59) --$tThe only one who gets me$g(3:46) --$tRound in circles$g(4:17) --$tI wish you were here /$r(featuring Miranda Lambert)$g(3:48) --$tLeaving Nashville$g(3:29).

The goal is to turn this into:

=505 0\$aYour love -- The driver / featuring Dierks Bentley and Eric Pasley -- Dancing around it -- Southern accents / featuring Stevie Nicks -- Lonely girl -- The only one who gets me -- Round in circles -- I wish you were here / featuring Miranda Lambert -- Leaving Nashville.

That is, =505 0\$a[Contents], which means:

1) Changing the 2nd indicator from 0 to "\"
2) Removing all subfield codes ($t, $r, $g)
3) Removing the times in parentheses, for example (4:17), (4:34), (4:38)
4) Removing the parentheses ( ) after /$r but keeping the "featuring ..." text, for example /$r(featuring Miranda Lambert) becomes / featuring Miranda Lambert
5) Turning the first subfield into $a

Another example:

=505 00$tReasons for the tears I cry$g(3:54) --$tDown to my last bad habit$g(4:40) --$tMe and my girl$g(3:17) --$tLike my daddy did$g(3:09) --$tMake you feel real good$g(4:11) --$tI can't do this$g(3:27) --$tMy favorite movie$g(3:55) --$tOne more mistake I made /$r(featuring Chris Botti)$g(3:23) --$tTake me down /$r(featuring Little Big Town)$g(5:01) --$tI'll be waiting for you /$r(featuring Cam)$g(3:25) --$tWhen it's love$g(3:46) --$tSad one comin' on (a song for George Jones)$g(3:52).

This was answered and solved by Walter F. Nickeson, Catalog & Metadata Management Librarian, Rush Rhees Library, University of Rochester.

Answer: This is a multi-step process, as far as I can see. The first step is a simple edit; the other steps all use the "Edit Field Data" tool with regular expressions.

1. A simple find and replace to change the indicator and first subfield code.

Find: =505 00$t
Replace: =505 0\$a

Result: =505 0\$aYour love$g(4:17) --$tThe driver /$r(featuring Dierks Bentley and Eric Pasley)$g(4:34) --$tDancing around it$g(4:38) --$tSouthern accents /$r(featuring Stevie Nicks)$g(4:15) --$tLonely girl$g(2:59) --$tThe only one who gets me$g(3:46) --$tRound in circles$g(4:17) --$tI wish you were here /$r(featuring Miranda Lambert)$g(3:48) --$tLeaving Nashville$g(3:29).

2. Use the Edit Field Data tool to delete subfield coding.

Field: 505
Find: \$[grt]
Replace: [space] (just a space; "$1" is not needed because the Find pattern has no capture group)
Use regular expressions.

The regex looks for a delimiter (the dollar sign) followed by one of the three subfield codes "g", "r", or "t", and replaces the pair with a space. (Your examples show no spaces around the subfield coding, but if there were some, you'd end up with double spaces to clean up as a last step.)

Result: =505 0\$aYour love (4:17) -- The driver / (featuring Dierks Bentley and Eric Pasley) (4:34) -- Dancing around it (4:38) -- Southern accents / (featuring Stevie Nicks) (4:15) -- Lonely girl (2:59) -- The only one who gets me (3:46) -- Round in circles (4:17) -- I wish you were here / (featuring Miranda Lambert) (3:48) -- Leaving Nashville (3:29).

3. Use the Edit Field Data tool to delete durations.

Field: 505
Find: [space]\(\d*?:\d+?\)
Replace: [nothing]
Use regular expressions.

Of course, you don't type "[space]"; just press the space bar to put a space before that first backslash. Likewise, don't type "[nothing]"; just leave the box completely empty. The regex looks for a space, a left parenthesis, zero or more digits, a colon, at least one digit, and then a right parenthesis. (If the records describe really long works with times that use two colons, this probably won't work. Or if times are sometimes given with words (e.g., "4 min., 17 sec."), you'd have to try another approach.)

Result: =505 0\$aYour love -- The driver / (featuring Dierks Bentley and Eric Pasley) -- Dancing around it -- Southern accents / (featuring Stevie Nicks) -- Lonely girl -- The only one who gets me -- Round in circles -- I wish you were here / (featuring Miranda Lambert) -- Leaving Nashville.

4. Use the Edit Field Data tool to delete the remaining parentheses.

Field: 505
Find: [\(\)]
Replace: [nothing]
Use regular expressions.

The regex looks for either a left or right parenthesis and simply deletes it.

Result: =505 0\$aYour love -- The driver / featuring Dierks Bentley and Eric Pasley -- Dancing around it -- Southern accents / featuring Stevie Nicks -- Lonely girl -- The only one who gets me -- Round in circles -- I wish you were here / featuring Miranda Lambert -- Leaving Nashville.
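
For anyone who wants to try the patterns outside MarcEdit before running them on real records, the four steps above can be sketched in Python. This is only an illustration under a few assumptions: Python's re module stands in for MarcEdit's regular expression engine, the function name flatten_505 is made up for this example, and the input is a single 505 line in mnemonic form exactly as shown in the question.

```python
import re


def flatten_505(line: str) -> str:
    """Apply the four steps from the answer to one mnemonic 505 line.

    Hypothetical helper for illustration; it is not part of MarcEdit.
    """
    # Step 1: plain find/replace of the indicators and first subfield code.
    line = line.replace("=505 00$t", "=505 0\\$a", 1)
    # Step 2: replace the remaining $g, $r and $t subfield codes with a space.
    line = re.sub(r"\$[grt]", " ", line)
    # Step 3: delete durations such as " (4:17)", including the leading space.
    line = re.sub(r" \(\d*?:\d+?\)", "", line)
    # Step 4: delete every parenthesis that is still left in the field.
    line = re.sub(r"[\(\)]", "", line)
    return line


sample = (
    r"=505 00$tYour love$g(4:17) --$tThe driver /"
    r"$r(featuring Dierks Bentley and Eric Pasley)$g(4:34) --"
    r"$tLeaving Nashville$g(3:29)."
)
print(flatten_505(sample))
```

Because step 4 deletes every parenthesis in the field, a title that legitimately contains them, such as "Sad one comin' on (a song for George Jones)" in the second example, loses its parentheses as well. It is worth checking the results for titles like that one.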
