perl - How to match these patterns using regex lookahead?


I have to sort files into different groups according to a pattern. What I need is an ID that identifies the group a file belongs to. It does not matter how this ID is built (apart from that it should not be empty); it just has to be the same for all files within one group. I am trying to build the ID directly from the file name according to this rule:

  • Remove "dokument" or "signatur" from the end of the file's base name, together with a preceding "_".
  • To avoid an empty ID, prepend a dummy string (like "id") to the result.

This should probably be possible with a simple regex, but I cannot get it to work.

Here's my attempt:

    for (<DATA>) {
        chomp;    # drop the trailing newline so it does not end up in the output
        my ($match) = ($_ =~ /(.*?)(?:dokument|signatur)?(?:\..*)/);
        print $_ . " => id" . $match . "\n";
    }
    __DATA__
    dokument.pdf
    dokument.rtf
    dokument.html
    COO_2026_100_2_dokument.pdf
    COO_2026_100_2.zip
    dokument.xml
    signatur.xml
    COO_2026_100_2_dokument.xml
    COO_2026_100_2_dokument.rtf
    COO_2026_100_2_signatur.xml
    COO_2026_100_3_dokument.xml

What should happen:

  • dokument.* and signatur.* go into one group
  • *_2* goes into another group
  • *_3* goes into a third group

What actually happens is that everything is fine except for the zip file, because its ID lacks the trailing "_" that the other IDs in its group still carry. I suspect this can be solved with lookahead, but I have no clue how, and maybe I am wrong.

Any ideas?

The idea of a lookahead is to match a given pattern only when it is followed by another pattern, which is itself not included in the match. It is a little hard to follow what you are after, but if I understand you correctly, this will work:

    .*?(?=_?(dokument|signatur|\.[^.]+$))

This matches everything that comes before either "dokument" or "signatur" (and before the preceding _, if any), or before the extension if neither of those words is present.
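To make the pieces of that pattern easier to see, here is a sketch of the same expression written with the /x flag so that each part can carry a comment. The helper name `id_for` is made up for illustration and is not part of the original answer. The sketch also uses a capture group instead of `$&`, purely so the result can be returned from a function.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the group ID for one file name.
sub id_for {
    my ($name) = @_;
    my ($prefix) = $name =~ /
        (.*?)                  # shortest possible prefix, which may be empty
        (?=                    # lookahead: what follows is checked, not consumed
            _?                 #   an optional underscore, then
            (?: dokument       #   the word dokument,
              | signatur       #   or the word signatur,
              | \.[^.]+$       #   or the final extension
            )
        )
    /x;
    # If nothing matched at all, fall back to the bare dummy string.
    return 'id' . ($prefix // '');
}

print "$_ => ", id_for($_), "\n"
    for qw(dokument.pdf COO_2026_100_2_dokument.pdf COO_2026_100_2.zip);
```

With the sample names above, the two COO_2026_100_2 files should end up with the same ID, and dokument.pdf should get the bare dummy string.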

Some notes:

  • You do not have to match the extension when it follows "dokument" or "signatur". You only need to match it when neither of them is found. Once one of them is found, you are simply taking everything that comes before it (and before the preceding _, if any) as the result.
  • Matching the extension with \..* may work for these file names, but it is not a reliable way to do it in general: if the file name contains more than one dot, it will match everything starting from the first dot. \.[^.]+$ ensures that you start from the last dot.
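The difference between the two extension patterns can be seen in a small sketch. The file name used here is hypothetical and does not come from the question's data; it was picked only because it contains several dots.

```perl
use strict;
use warnings;

# Hypothetical file name with more than one dot (not in the question's data).
my $name = 'report.v2.final.pdf';

# \..* starts at the first dot and greedily takes the rest of the name.
my ($from_first_dot) = $name =~ /(\..*)/;

# \.[^.]+$ can only match a dot that has no further dot after it.
my ($from_last_dot) = $name =~ /(\.[^.]+$)/;

print "first-dot pattern matched: $from_first_dot\n";   # .v2.final.pdf
print "last-dot pattern matched:  $from_last_dot\n";    # .pdf
```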

In addition, there is no need to use a capturing group or to assign the result to a variable. Just match the part of the file name you want to use, and retrieve it with $&:

    for (<DATA>) {
        chomp;    # drop the trailing newline so it does not end up in the output
        $_ =~ /.*?(?=_?(dokument|signatur|\.[^.]+$))/;
        print $_ . " => id" . $& . "\n";
    }
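Since the stated goal is to put files into groups, the ID can also be used as a hash key. The following is a minimal sketch under a few assumptions: the file names are copied from the question into an array instead of being read from `__DATA__`, the variable names `@files` and `%groups` are made up here, and a capture group is used in place of `$&` (a choice made for this sketch; on Perl versions before 5.20, using `$&` anywhere in a program could slow down other regex matches).

```perl
use strict;
use warnings;

# Sample names taken from the question's data section.
my @files = qw(
    dokument.pdf dokument.rtf dokument.html
    COO_2026_100_2_dokument.pdf COO_2026_100_2.zip
    dokument.xml signatur.xml
    COO_2026_100_2_dokument.xml COO_2026_100_2_dokument.rtf
    COO_2026_100_2_signatur.xml COO_2026_100_3_dokument.xml
);

my %groups;   # ID => array reference of file names
for my $file (@files) {
    # Lookahead pattern from the answer; the capture group holds the prefix.
    my ($prefix) = $file =~ /(.*?)(?=_?(?:dokument|signatur|\.[^.]+$))/;
    my $id = 'id' . ($prefix // '');
    push @{ $groups{$id} }, $file;
}

# Print one line per group, listing its members.
for my $id (sort keys %groups) {
    print "$id: @{ $groups{$id} }\n";
}
```

For this input the sketch should produce three groups: one for the bare dokument.* and signatur.* files, one for the *_2* files including the zip, and one for the single *_3* file.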
