VBA Remove BB Code Code Tags in text. Help with Wild cards

Doc.AElstein
BronzeLounger
Posts: 1499
Joined: 28 Feb 2015, 13:11
Location: Hof, Bayern, Germany

VBA Remove BB Code Code Tags in text. Help with Wild cards

Post by Doc.AElstein »

Hello,
I’m considering a code alternative to take BB Code out of text and need help with “replacing using wildcards” in VBA code….

So I have some text in Word documents that include BB Code Code Tags as well as Word formatting that matches the BB Code
So, for example , what you would actually see is this in the Word document:
[color=Blue]Sub[/color] BBCodeCodeTagsWegDaMit()
.Text = "" [color=Green]' ??? ' Search text in Wildcard Format ??[/color]

I would like to do a couple of similar things to it.
Both are basically removing the BB Code Code tags
_1) One way is automating doing it in Word.
_2) The other, still in a VBA code, is to take it out of the text copied to the clipboard

I have a code that almost does what I want:
Code Part 1)
Rem 1 ) Copies a selected text as it is
Rem 2) Makes a temporary Word document and pastes the selected stuff complete and saves it

Rem 3) I want this just to take out the BB Code, so I would get this back for the above example
Sub BBCodeCodeTagsWegDaMit()
.Text = "" ' ??? ' Search text in Wildcard Format ??
( So in the temporary word document I still have the color format )
I save that to a second temporary Word document "TempNoBBCodeCopy.docx"

( Rem 5) I close / kill those two files currently but I may not do that later )

Code part 2)
This is intended to take the text from the clipboard, remove the BB Code Code tags and then put the simple plain text back in the clipboard. So in the example I would end up with this in the clipboard

Sub BBCodeCodeTagsWegDaMit()
.Text = "" ' ??? ' Search text in Wildcard Format ??

_.....
So I had a go:
I got a lot of the way from here: http://www.eileenslounge.com/viewtopic. ... 03#p175712

But I am stuck on what I expect is the part to do with replacing using wildcards. I have no experience with them, and I get a bad headache whenever I try to figure them out.

The full code is :
Sub BBCodeCodeTagsWegDaMit()
and is in this File “eileenslounge.docm”
https://app.box.com/s/mtwz5jil0p5nbymqrb1cgszjuyg94bea
It is in the Normal Code Module
NormalCodesBackUp

Below is a stripped-down version with just the bits I am stuck on:

Code part 1) I expect I need just the correct wildcard bit here
.Text = "" ' ??? ' Search text in Wildcard Format ??

‘Possibly this is more subtle and I need like
.Text = [AnyText]ThisText[/AnyText]
‘and also to go with that
.Replacement.Text = ThisText

Code part 2) I am not sure what I need here, - basically I would like "Raped Text" to be what is in the variable TextWithBBCodeEnit but with all BB Code stuff removed. ( I know about Replace stuff and can manipulate strings quite well but doing it with wild cards I am unfamiliar with. ).

_- I am sure I can do it in a complicated way, like working backwards from the string end, looking for a combination of ] , [/ , ] and [ , and deleting those and all characters between them. I have done that a lot before, and I am having a go with that just now… . But it is about time I learnt how to do it properly…. But trying to figure out wildcards makes me go a bit ( more ) crazy….


Thanks
Alan



Shortened code ( where I need help is shown twice thus 'HELP !!!===Please ):

Code: Select all

Sub BBCodeCodeTagsWegDaMitSHimpfGlified()
Rem Code Part 1) make two temporary Word Files with and without BB Code
Rem 1) Copy selection to Clipboard
 Selection.Copy
Rem 2) Make temporary WORD document
Documents.Add: ActiveDocument.Content.Paste
' 2b) Copy of Full Text with BB Code
Dim FullFilePathAndFullNameBBCode As String
 ActiveDocument.SaveAs FileName:="TempBBCodeCopy.docx", FileFormat:=wdFormatXMLDocument
 Let FullFilePathAndFullNameBBCode = ActiveDocument.Path & "\" & ActiveDocument.Name
Rem 3) From Hans's Text Find Replacement Dialogue 'http://www.eileenslounge.com/viewtopic.php?f=26&t=22603#p175712
' 3a) Take out all BB Code
 Selection.WholeStory
    With Selection.Find
    .ClearFormatting: .Replacement.ClearFormatting
    .Wrap = wdFindStop
    '.MatchWildcards = False    ' Don't use wildcards. The default anyway, but in this code is an important ???
    .MatchWildcards = True ' ??? need this
    'HELP !!!===Please
    .Text = "" ' ???   ' Search text for BB Code Code Tags in Wildcard Format ??
    .Replacement.Text = ""    ' Replace text is nothing. ?? Or more subtle
    '=================
    .Execute Replace:=wdReplaceAll
    End With
' 3b) Copy of Colored Text without BB Code Code tags
Dim FullFilePathAndFullNameNoBBCode As String
 ActiveDocument.SaveAs FileName:="TempNoBBCodeCopy.docx", FileFormat:=wdFormatXMLDocument
 Let FullFilePathAndFullNameNoBBCode = ActiveDocument.Path & "\" & ActiveDocument.Name
Rem 4) "Reset the "Find Replace Text Dialogue" "Thing" "
 ActiveDocument.Select
 Selection.WholeStory
    With Selection.Find
    .ClearFormatting: .Replacement.ClearFormatting: .Text = "": .Replacement.Text = "":  .Forward = True: .Wrap = wdFindAsk: .Format = False: .MatchCase = False: .MatchWholeWord = False: .MatchKashida = False: .MatchDiacritics = False: .MatchAlefHamza = False: .MatchControl = False: .MatchWildcards = False: .MatchSoundsLike = False: .MatchAllWordForms = False '
    End With
Rem 5) Option to close / kill document
 ActiveDocument.Close (wdDoNotSaveChanges)
 Kill FullFilePathAndFullNameBBCode
 Kill FullFilePathAndFullNameNoBBCode

Rem Code Part 2 Put text without BB code into clipboard
Rem 6) Use data Object ( assumes we copied to the Clipboard )
'1 a) Data Object to get the data from the clipboard.
Dim objCliTextCopied As Object '  Late Binding equivalent'
 Set objCliTextCopied = GetObject("New:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}") '
'6a) Put all Clipboard Information into Data Object
 objCliTextCopied.GetFromClipboard
'6b) Get the text as one long String. This can be very long
Dim TextWithBBCodeEnit As String '
 Let TextWithBBCodeEnit = objCliTextCopied.GetText() ''retrieve the text
Rem 8) Take out BB Code bits from text string.
Dim RapedText As String
 'HELP======= Please :)
 Let RapedText = "Raped Text" ' I want here what is in TextWithBBCodeEnit but with BB Code Code tags removed
 '=====================
Rem 9) This is another object of the same class, to be sure we have the data in the Clipboard
Dim objDat As Object
 Set objDat = GetObject("New:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}")
'9b) Put in (and Get back from clipboard) the raped text.
 objDat.SetText RapedText
 objDat.PutInClipboard
End Sub
I am having difficulty logging in with this account just now.
You can find me at DocAElstein also

HansV
Administrator
Posts: 78241
Joined: 16 Jan 2010, 00:14
Status: Microsoft MVP
Location: Wageningen, The Netherlands

Re: VBA Remove BB Code Code Tags in text. Help with Wild ca

Post by HansV »

If your document doesn't contain [...] pairs other than BBCode tags, you can use

Code: Select all

Sub WegDaMit()
    ActiveDocument.Content.Find.Execute FindText:="\[*\]", ReplaceWith:="", _
        MatchWildcards:=True, Format:=False, Replace:=wdReplaceAll
End Sub
Best wishes,
Hans
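
For Code part 2 (plain text taken from the clipboard rather than a Word range), the same "drop every [...] span" idea can be sketched with ordinary string functions. This is only a sketch under assumptions: StripBracketSpans is a made-up name, not something built into Word or VBA, and like the wildcard version it assumes the text contains no [...] pairs other than BBCode tags.

```vba
Function StripBracketSpans(ByVal s As String) As String
    ' Hypothetical helper: removes every [...] span from s,
    ' roughly what Find "\[*\]" with an empty replacement does in Word.
    Dim pOpen As Long, pClose As Long
    pOpen = InStr(1, s, "[")
    Do While pOpen > 0
        pClose = InStr(pOpen + 1, s, "]")
        If pClose = 0 Then Exit Do      ' a lone [ with no closing ]: leave the rest alone
        ' Keep what is before the [ and what is after the ]
        s = Left$(s, pOpen - 1) & Mid$(s, pClose + 1)
        ' Look for the next [ from the same position in the shortened string
        pOpen = InStr(pOpen, s, "[")
    Loop
    StripBracketSpans = s
End Function
```

In the shortened code in the first post, the placeholder assignment could then become Let RapedText = StripBracketSpans(TextWithBBCodeEnit).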

Doc.AElstein
BronzeLounger
Posts: 1499
Joined: 28 Feb 2015, 13:11
Location: Hof, Bayern, Germany

Re: VBA Remove BB Code Code Tags in text. Help with Wild ca

Post by Doc.AElstein »

HansV wrote:..... you can use
...
ActiveDocument.Content.Find.Execute FindText:="\[*\]", ReplaceWith:="", _
MatchWildcards:=True, Format:=False, Replace:=wdReplaceAll
Thanks Hans,

I had a go at adapting that to my simplified code. It works great.

Code: Select all

' 3a) Take out all BB Code
 Selection.WholeStory
    With Selection.Find
    .ClearFormatting: .Replacement.ClearFormatting
    .Wrap = wdFindStop
    .MatchWildcards = True ' ??? need this
    'HELP !!!===Please
    .Text = "\[*\]" ' Search text for single BB Code Code Tag in Wildcard Format
    '=================
    .Replacement.Text = ""    ' Replace text is nothing.
    .Execute Replace:=wdReplaceAll
    End With
So I guess this is saying :
Find.Text is
[ & AnythingInIt & ]

So I wondered if I could modify that somehow to do catch a valid pair.
[ & AnythingInIt & ] & Anything & [/ & Anything & ]

I had a go

This appears to work to .Find a valid pair, I think,
.Text = "\[*\]*\[/*\]" ' Search for valid pair

BUT it loses what is in between_.....

_.....You showed me how to do something interesting with the .Replacement.Text here: http://www.eileenslounge.com/viewtopic. ... 03#p175750
which was this bit ^& . So I did
.Replacement.Text = "BBCRD^&BBCRD"

Then I repeat the .Find.Text .Find.Replacement.Text twice to get rid of the BBCRD’s stuck onto a BB Code Code Tag

It seems to work in most cases..... one problem I see is that a lone [ puts a spanner in the works.
But it is getting close.

Thanks again

Alan
_.............


Code Part 1 almost there:
It will turn like this_....
[color=Blue]Sub[/color] BBCodeCodeTagsWegDaMit()
.Text = "" [color=Green]' ??? ' Search text in Wildcard Format ??[/color]
_... into this:
Sub BBCodeCodeTagsWegDaMit()
.Text = "" ' ??? ' Search text in Wildcard Format ??

Code: Select all

Sub WegDaMitSHGHans1()
Rem Code Part 1) make two temporary Word Files with and without BB Code
Rem 1) Copy selection to Clipboard
 Selection.Copy
Rem 2) Make temporary WORD document
Documents.Add: ActiveDocument.Content.Paste
' 2b) Copy of Full Text with BB Code
Dim FullFilePathAndFullNameBBCode As String
 ActiveDocument.SaveAs FileName:="TempBBCodeCopy.docx", FileFormat:=wdFormatXMLDocument
 Let FullFilePathAndFullNameBBCode = ActiveDocument.Path & "\" & ActiveDocument.Name
Rem 3) From Hans's Text Find Replacement Dialogue 'http://www.eileenslounge.com/viewtopic.php?f=26&t=22603#p175712
' 3a) Take out all BB Code
' 3a)(i) Enclose valid BB Code Tag Pair in  BBCRD bits
 Selection.WholeStory
    With Selection.Find
    .ClearFormatting: .Replacement.ClearFormatting
    .Wrap = wdFindStop
    .MatchWildcards = True '
    .Text = "\[*\]*\[/*\]" ' Search for valid pair
    '=================
    .Replacement.Text = "BBCRD^&BBCRD" '
    .Execute Replace:=wdReplaceAll
    End With
' 3a)(ii) Take out first BBCRD and BB Code crap ;)
 Selection.WholeStory
    With Selection.Find
    .ClearFormatting: .Replacement.ClearFormatting
    .Wrap = wdFindStop
    .MatchWildcards = True '
    .Text = "BBCRD\[*\]" ' BBCRD marker followed by the opening tag
    '=================
    .Replacement.Text = ""   ' Replace text is nothing.
    .Execute Replace:=wdReplaceAll
    End With
' 3a)(iii) Take out closing BB Code tag and the BBCRD after it ;)
 Selection.WholeStory
    With Selection.Find
    .ClearFormatting: .Replacement.ClearFormatting
    .Wrap = wdFindStop
    .MatchWildcards = True '
    .Text = "\[/*\]BBCRD" '
    '=================
    .Replacement.Text = ""   ' Replace text is nothing.
    .Execute Replace:=wdReplaceAll
    End With
' 3b) Copy of Colored Text without BB Code Code tags
Dim FullFilePathAndFullNameNoBBCode As String
 ActiveDocument.SaveAs FileName:="TempNoBBCodeCopy.docx", FileFormat:=wdFormatXMLDocument
 Let FullFilePathAndFullNameNoBBCode = ActiveDocument.Path & "\" & ActiveDocument.Name
Rem 4) "Reset the "Find Replace Text Dialogue" "Thing" "
 ActiveDocument.Select
 Selection.WholeStory
    With Selection.Find
    .ClearFormatting: .Replacement.ClearFormatting: .Text = "": .Replacement.Text = "":  .Forward = True: .Wrap = wdFindAsk: .Format = False: .MatchCase = False: .MatchWholeWord = False: .MatchKashida = False: .MatchDiacritics = False: .MatchAlefHamza = False: .MatchControl = False: .MatchWildcards = False: .MatchSoundsLike = False: .MatchAllWordForms = False '
    End With
Rem 5) Option to close / kill document
 ActiveDocument.Close (wdDoNotSaveChanges)
 'Kill FullFilePathAndFullNameBBCode
 'Kill FullFilePathAndFullNameNoBBCode
End Sub ' End Code Part 1)

HansV
Administrator
Posts: 78241
Joined: 16 Jan 2010, 00:14
Status: Microsoft MVP
Location: Wageningen, The Netherlands

Re: VBA Remove BB Code Code Tags in text. Help with Wild ca

Post by HansV »

Is this better?

Code: Select all

Sub WegDaMit()
    ActiveDocument.Content.Find.Execute FindText:="\[*\](*)\[/*\]", ReplaceWith:="\1", _
        MatchWildcards:=True, Format:=False, Replace:=wdReplaceAll
End Sub
Best wishes,
Hans

Doc.AElstein
BronzeLounger
Posts: 1499
Joined: 28 Feb 2015, 13:11
Location: Hof, Bayern, Germany

Re: VBA Remove BB Code Code Tags in text. Help with Wild ca

Post by Doc.AElstein »

Thanks again Hans,
HansV wrote:Is this better?
ActiveDocument.Content.Find.Execute FindText:="\[*\](*)\[/*\]", ReplaceWith:="\1", _
MatchWildcards:=True, Format:=False, Replace:=wdReplaceAll
It is doing what my last code does but a lot tidier, and it was what I had tried to find....
Doc.AElstein wrote:......
‘Possibly this is more subtle and I need like
.Text = [AnyText]ThisText[/AnyText]
‘and also to go with that
.Replacement.Text = ThisText
..

I suppose that is telling me that I can define/ mark a section with a ( ) , then each section is numbered from the left. I looked for stuff like that and probably still would be for a very long time.
Thanks very much. That’s a great help :)

Alan


Code: Select all

Sub WegDaMitSHGHans2()
Rem Code Part 1) make two temporary Word Files with and without BB Code
Rem 1) Copy selection to Clipboard
 Selection.Copy
Rem 2) Make temporary WORD document
Documents.Add: ActiveDocument.Content.Paste
' 2b) Copy of Full Text with BB Code
Dim FullFilePathAndFullNameBBCode As String
 ActiveDocument.SaveAs FileName:="TempBBCodeCopy.docx", FileFormat:=wdFormatXMLDocument
 Let FullFilePathAndFullNameBBCode = ActiveDocument.Path & "\" & ActiveDocument.Name
Rem 3) Replace Code tag pairs with what is in between
 Selection.WholeStory
    With Selection.Find
    .ClearFormatting: .Replacement.ClearFormatting
    .Wrap = wdFindStop
    .MatchWildcards = True '
    .Text = "(\[*\])(*)(\[/*\])" ' "\[*\]*\[/*\]" ' Search for valid pair, mark / define sections with ( )'s
    '=================
    .Replacement.Text = "\2" '
    .Execute Replace:=wdReplaceAll
    End With
' 3b) Copy of Colored Text without BB Code Code tags
Dim FullFilePathAndFullNameNoBBCode As String
 ActiveDocument.SaveAs FileName:="TempNoBBCodeCopy.docx", FileFormat:=wdFormatXMLDocument
 Let FullFilePathAndFullNameNoBBCode = ActiveDocument.Path & "\" & ActiveDocument.Name
Rem 4) "Reset the "Find Replace Text Dialogue" "Thing" "
 ActiveDocument.Select
 Selection.WholeStory
    With Selection.Find
    .ClearFormatting: .Replacement.ClearFormatting: .Text = "": .Replacement.Text = "":  .Forward = True: .Wrap = wdFindAsk: .Format = False: .MatchCase = False: .MatchWholeWord = False: .MatchKashida = False: .MatchDiacritics = False: .MatchAlefHamza = False: .MatchControl = False: .MatchWildcards = False: .MatchSoundsLike = False: .MatchAllWordForms = False '
    End With
Rem 5) Option to close / kill document
 ActiveDocument.Close (wdDoNotSaveChanges)
 'Kill FullFilePathAndFullNameBBCode
 'Kill FullFilePathAndFullNameNoBBCode
End Sub ' End Code Part 1)

macropod
4StarLounger
Posts: 508
Joined: 17 Dec 2010, 03:14

Re: VBA Remove BB Code Code Tags in text. Help with Wild ca

Post by macropod »

If you want to find & delete only matched tags, you could use a wildcard Find/Replace, where:
Find = \[([!\]=]@)*\](*)\[/\1\]
Replace = \2
Paul Edstein
[Fmr MS MVP - Word]

Doc.AElstein
BronzeLounger
Posts: 1499
Joined: 28 Feb 2015, 13:11
Location: Hof, Bayern, Germany

Re: VBA Remove BB Code Code Tags in text. Help with Wild ca

Post by Doc.AElstein »

macropod wrote:.. only matched tags....wildcard Find/Replace....
Find = \[([!\]=]@)*\](*)\[/\1\]
Replace = \2
Hi Paul
Thanks for the reply,
I was puzzled at first....
Then I thought Find Replace sounds like a Dialogue Box Idea....
So.. had a play.... ran a macro recording_.....
PaulFindReplace.jpg http://imgur.com/HeGhT1L
http://www.gmayor.com/replace_using_wildcards.htm
PaulFindReplaceMacroRecording.jpg http://imgur.com/4Nqj1HP
_.....
Bingo !!!! Perfect!!

I think this is exactly what I was looking for. It appears to pick out the exact string and somehow does not get fooled by things like a rogue [ in text
I have tried very hard to understand how it is working, but when I try to follow the literature on Wildcards it really gives me a headache..
If you have the time can you break the "\[*\]*\[/*\]" down and explain it for me.
No rush.
Thanks again, I am very grateful Paul.
Alan

_.......
Code Part 1 with "\[([!\]=]@)*\](*)\[/\1\]" InIt

Code: Select all

Sub WegDaMitPaulMod() '    http://www.eileenslounge.com/viewtopic.php?f=26&t=26030&p=202107#p202107
Rem Code Part 1) make two temporary Word Files with and without BB Code
Rem 1) Copy selection to Clipboard
 Selection.Copy
Rem 2) Make temporary WORD document
Documents.Add: ActiveDocument.Content.Paste
' 2b) Copy of Full Text with BB Code
Dim FullFilePathAndFullNameBBCode As String
 ActiveDocument.SaveAs FileName:="TempBBCodeCopy.docx", FileFormat:=wdFormatXMLDocument
 Let FullFilePathAndFullNameBBCode = ActiveDocument.Path & "\" & ActiveDocument.Name
Rem 3) Replace Code tag pairs with what is in between
 Selection.WholeStory
    With Selection.Find
    .ClearFormatting: .Replacement.ClearFormatting
    .Wrap = wdFindStop
    .MatchWildcards = True '
    .Text = "\[([!\]=]@)*\](*)\[/\1\]"
    '=================
    .Replacement.Text = "\2" '
    .Execute Replace:=wdReplaceAll
    End With
' 3b) Copy of Colored Text without BB Code Code tags
Dim FullFilePathAndFullNameNoBBCode As String
 ActiveDocument.SaveAs FileName:="TempNoBBCodeCopy.docx", FileFormat:=wdFormatXMLDocument
 Let FullFilePathAndFullNameNoBBCode = ActiveDocument.Path & "\" & ActiveDocument.Name
Rem 4) "Reset the "Find Replace Text Dialogue" "Thing" "
 ActiveDocument.Select
 Selection.WholeStory
    With Selection.Find
    .ClearFormatting: .Replacement.ClearFormatting: .Text = "": .Replacement.Text = "":  .Forward = True: .Wrap = wdFindAsk: .Format = False: .MatchCase = False: .MatchWholeWord = False: .MatchKashida = False: .MatchDiacritics = False: .MatchAlefHamza = False: .MatchControl = False: .MatchWildcards = False: .MatchSoundsLike = False: .MatchAllWordForms = False '
    End With
Rem 5) Option to close / kill document
 ActiveDocument.Close (wdDoNotSaveChanges)
 'Kill FullFilePathAndFullNameBBCode
 'Kill FullFilePathAndFullNameNoBBCode
End Sub ' End Code Part 1)
_.....

P.s. After many hours I am almost finished a very long string manipulation code to do the same for part 2 of my code working on the full string of the character text put in the clipboard. I don’t need that now , as I can get the perfect text from the final Selection Whole Story thing from the above code. Maybe I will complete it and post it or a link to it anyway, just to show a comparison of the way not to do it, lol... :)

macropod
4StarLounger
Posts: 508
Joined: 17 Dec 2010, 03:14

Re: VBA Remove BB Code Code Tags in text. Help with Wild ca

Post by macropod »

Doc.AElstein wrote:If you have the time can you break the "\[*\]*\[/*\]" down and explain it for me.
I'll break down the \[([!\]=]@)*\](*)\[/\1\] for you:
\[ = an opening [.
([!\]=]@) = any string of characters that does not include a closing ] or an =. Putting it in parentheses stores whatever this string matches for re-use. This captures the opening part of the Code Tags that you need to match for the closing expression.
*\] = any string of characters terminating in a closing ].
(*)\[ = any string of characters terminating in an opening [. Putting the asterisk in parentheses stores whatever it matches for re-use. This captures the part of the code you want to retain.
/\1\] = a / character, followed by whatever the first parenthetic expression matched, terminated in a closing ]. Together, these form the closing side of the Code Tag pair.
Paul Edstein
[Fmr MS MVP - Word]
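
A sketch of how the pieces above fit together in VBA, with the Find expression built up one part at a time so each comment sits next to the bit it describes. The Sub name and the variable sFind are only illustrative; the expression itself is the one broken down in this post.

```vba
Sub StripMatchedBBCodePairs()
    ' Hypothetical name. Assembles the wildcard Find expression part by part.
    Dim sFind As String
    sFind = "\["                  ' a literal opening [
    sFind = sFind & "([!\]=]@)"   ' group 1: the tag name, one or more characters that are not ] or =
    sFind = sFind & "*\]"         ' anything up to the closing ] of the opening tag, e.g. =Blue
    sFind = sFind & "(*)"         ' group 2: the text between the tags, which we keep
    sFind = sFind & "\[/\1\]"     ' the closing tag: [/ then whatever group 1 matched, then ]

    With ActiveDocument.Content.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Text = sFind
        .Replacement.Text = "\2"  ' put back only the text between the tags
        .MatchWildcards = True
        .Forward = True
        .Wrap = wdFindStop
        .Format = False
        .Execute Replace:=wdReplaceAll
    End With
End Sub
```

Where tags are nested inside one another, a single pass probably removes only the outer pair, so the macro may need to be run again until nothing changes.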

Leif
Administrator
Posts: 7193
Joined: 15 Jan 2010, 22:52
Location: Middle of England

Re: VBA Remove BB Code Code Tags in text. Help with Wild ca

Post by Leif »

Doc.AElstein wrote:
x.jpg
Are you intentionally disguising the filenames? With some skins, "LightGrey" font is not easy to read!
Leif

Doc.AElstein
BronzeLounger
Posts: 1499
Joined: 28 Feb 2015, 13:11
Location: Hof, Bayern, Germany

Re: VBA Remove BB Code Code Tags in text. Help with Wild ca

Post by Doc.AElstein »

Leif wrote:....Are you intentionally disguising ... "LightGrey" font is not easy to read!
Hi Leif.
It was just for my own reference. It is the name I have it stored under. I see it clearly. I did not appreciate it was difficult for anyone to see. ( They don’t need to , but I guess it might irritate ). I will use grey instead. Thanks for the heads up
( A different color just helps remind me what it is. I use Imgur rather than the built in facility sometimes if it makes the thread a bit easier to follow when the pictures are only for reference and not directly needed )
Alan
Last edited by Doc.AElstein on 11 Feb 2017, 14:08, edited 1 time in total.

Doc.AElstein
BronzeLounger
Posts: 1499
Joined: 28 Feb 2015, 13:11
Location: Hof, Bayern, Germany

Re: VBA Remove BB Code Code Tags in text. Help with Wild ca

Post by Doc.AElstein »

macropod wrote:....I'll break down the \[([!\]=]@)*\](*)\[/\1\] for you:....
([!\]=]@) = any string of characters that does not include a closing ] or an =......
Hi Paul
Thanks again.
I think I have it almost.
I changed it a bit just to get it a bit clearer in my head. ..
.Text = "(\[)([!\]=]@)(*\])(*)(\[/\2\])"
Or
.Text = "(\[)([!\]=]@)(*\])(*)(\[/)(\2\])"
Or
.Text = "(\[)([!\]=]@)(*\])(*)(\[/)(\2)(\])" ' (1)(2)(3)(4)(5)(6)(7) - 7 bits

.Replacement.Text = "\4" ' This is for the last one with ( 7 ) bits

Looking at the last .Text, these parts I understand fully now, thanks to you :)
_(1) (\[) Opening [
_(2) ? – I know what it does but not how
**
_(3) (*\]) any string of characters terminating in a closing ]
_(4) (*) any string of characters. This is the bit between the BB Code Code Tags
_(5) (\[/) an opening two character bit like [/
_(6) (\2) This is exactly what (2) found
_(7) (\]) a closing ]
(5)(6)(7) This is the complete closing BB Code Code Tag


The part I am struggling on is this: [!\]=]@
I think I know what it is doing. I understand that it finds the part of the text like in from this
color=#BFFFFF
it would give me
color

I am not following **how that is coming from [!\]=]@
I have really done my head in for a few hours on the internet but I am not understanding how this bit works. – Every time I read something I come up with a different answer.


Can you help me again on this last bit, please?
Thanks
Alan



P.S. EDIT: Is it possible to make this solution case insensitive?
I mean this _..
Any text
_..is “valid” ( or “works” ) as BB Code, but isn’t caught currently
Thanks

macropod
4StarLounger
Posts: 508
Joined: 17 Dec 2010, 03:14

Re: VBA Remove BB Code Code Tags in text. Help with Wild ca

Post by macropod »

As I said:
([!\]=]@) = any string of characters that does not include a closing ] or an =.
Breaking that down further:
• the outer [] pair is the normal wildcard means of specifying the 'find' as being any of the characters between them.
• the ! is the normal wildcard means of specifying the 'find' as not being any of the characters that follow. In this case, the !\]= tells Word to exclude both a closing ] and an = - anything else is valid to 'find'. Thus, the \[([!\]=]@) tells Word to find an opening [ followed by any string of characters that does not include a closing ] or an =.
IMHO, changing the Find expression as you did adds nothing to its clarity - and doing so on more complex find expressions is liable to result in an 'expression too complex' error.
As for:
Doc.AElstein wrote:Is it possible to make this solution case insensitive?
I mean this _..
Any text
_..is “valid” ( or “works” ) as BB Code, but isn’t caught currently
Thanks
the whole expression is case agnostic. The only way case might become an issue is if someone used something like:

Code: Select all

[Color=Blue]Sub[/color]
In that situation, a match won't be made because what is found by ([!\]=]@) is not the same as the string being searched for later in the expression via the \1. This cannot be overcome with a wildcard Find.
Paul Edstein
[Fmr MS MVP - Word]
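
If mixed-case pairs like that do have to be handled, one way around the limitation is to leave wildcard Find and treat the text as a plain string, using the VBScript regular expression object, which has an IgnoreCase switch. A sketch only, under stated assumptions: StripBBCodePairs is a made-up name, the "VBScript.RegExp" library has to be available on the machine (it normally ships with Windows), and because it works on a String, no Word formatting survives. As far as I know the \1 back-reference also honours IgnoreCase, but that is worth checking on your own text.

```vba
Function StripBBCodePairs(ByVal s As String) As String
    ' Hypothetical helper, not part of Word: removes matched [tag]...[/tag] pairs from a string.
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")    ' late bound, so no reference needs setting
    With re
        .Global = True
        .IgnoreCase = True
        ' \[([^\]=]+)   an opening [ and the tag name (group 1)
        ' (?=[\]=])     the name must run right up to the = or the ]
        ' [^\]]*\]      any =value, then the closing ] of the opening tag
        ' ([\s\S]*?)    the text to keep (group 2), as short as possible
        ' \[/\1\]       the closing tag, with the same name as group 1
        .Pattern = "\[([^\]=]+)(?=[\]=])[^\]]*\]([\s\S]*?)\[/\1\]"
    End With
    ' Repeat until no pair is left, so that nested pairs are peeled off too
    Do While re.Test(s)
        s = re.Replace(s, "$2")
    Loop
    StripBBCodePairs = s
End Function
```

This would also fit Code part 2 of the first post, where the text is already in a String variable taken from the clipboard.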

Doc.AElstein
BronzeLounger
Posts: 1499
Joined: 28 Feb 2015, 13:11
Location: Hof, Bayern, Germany

Re: VBA Remove BB Code Code Tags in text. Help with Wild ca

Post by Doc.AElstein »

Thanks, Paul, I really appreciate the patience. I know you are understandably naffed off with me.

I am getting there. I think what you are saying is close to one of the answers I was coming up with, so it really helps when someone who really knows can explain it.
I was a bit thrown off with ““closing” ]” I think. You were referring to a ] and a =

I think I was seeing we have an outer [ ] . And somehow a \ “starts” something so that allows you to put in a ] after it and that will not then be taken as the closing ]

I had thought that the ! is the normal wildcard means of specifying the 'find' as not being any of the characters that follow. I had read it, but it really does help when you say it, as there is some different / incorrect info out there which can throw one off. ( People like you are the real authorities on this stuff! ).



The last thing I am not getting is the @
According to the literature it is telling me “any amount of ] or =”. Without that it does not work. I guess then it is looking for specifically one = and one ] .
I am struggling again.. :(
I am 3 hours into experimenting but not getting there.
Any chance of more help here, sorry to trouble you. You have given me a great solution.
So I am really needing now just to see what that @ is doing, and why it does not work without it

Thanks again
Alan

StuartR
Administrator
Posts: 12577
Joined: 16 Jan 2010, 15:49
Location: London, Europe

Re: VBA Remove BB Code Code Tags in text. Help with Wild ca

Post by StuartR »

Alan,
Doc.AElstein wrote:...
The last thing I am not getting is the @
According to the literature it is telling me “any amount of ] or =”. Without that it does not work. I guess then it is looking for specifically one = and one ] .
...
The @ sign tells find to look for any amount of NOT ] or =
It will match strings like Color, or Blue, or Bold, or any other string of any length that doesn't include ] or =
StuartR


Doc.AElstein
BronzeLounger
Posts: 1499
Joined: 28 Feb 2015, 13:11
Location: Hof, Bayern, Germany

Re: VBA Remove BB Code Code Tags in text. Help with Wild ca

Post by Doc.AElstein »

Thanks Stuart.
Actually I understood that.
It was an oversight / Typo that I wrote “any amount of ] or =”
I had meant to say that from the literature I understood "any amount of what is before it," in this case that is, as you clearly say, “any amount of NOT ] or =”
But it does not get me further with understanding what it is doing here , and why without it it is not working. :sad: :scratch:
Thanks for catching that
Alan

Doc.AElstein
BronzeLounger
Posts: 1499
Joined: 28 Feb 2015, 13:11
Location: Hof, Bayern, Germany

Re: VBA Remove BB Code Code Tags in text. Help with Wild ca

Post by Doc.AElstein »

How does this sound...
Without the @, this [!\]=] is going to find in the word color 5 things
c
o
l
o
r
The complete string from Paul is going to try to match 5 times an end tag having between [/ and ] c o l o or r
There is no such words as c o l o or r in any end tag

With the @, this [!\]=]@ is going to find in the word color one thing
color
The complete string from Paul is going to try to match an end tag with color
That is what is wanted

Hmm_.....


_.....
Assuming I might have sussed it... then one thing that made it a bit difficult... was that I tried stepping through each Find with variations of the search string with the Find Replace dialogue box thing:_....
Find5TimesTellItOnce.JPG http://imgur.com/zjH9bzK
_... I was hoping at some point it might like find something 5 times in some cases. But I guess it does the search, lists everything it finds once, or searches one at a time, but once something is found it remembers it and so ignores it if it finds it again. Hence it only “showed” a hit once.
Last edited by Doc.AElstein on 12 Feb 2017, 14:00, edited 1 time in total.

HansV
Administrator
Posts: 78241
Joined: 16 Jan 2010, 00:14
Status: Microsoft MVP
Location: Wageningen, The Netherlands

Re: VBA Remove BB Code Code Tags in text. Help with Wild ca

Post by HansV »

Take the text

[color=green]This is green[/color]

\[ finds the opening "[". We must precede [ with a \ to tell Word that we are looking for the literal character "[" because otherwise [ has a special meaning in wildcard searches.
[!\]=] searches for a single character that is not "]" or "=". Again, we use \] to tell Word that we mean the literal character "]".
So in the above example, [!\]=] finds the character "c" (the first letter of "color") after the "[". It does NOT find "o", "l", etc. because these characters aren't adjacent to the "[".
But [!\]=]@ finds one or more consecutive characters, none of which is "]" or "=".
In the above example, it finds the word "color": it stops when it encounters the "=" after "color".

@ is shorthand for {1,} i.e. one or more of the character specified to the left of {1,} or @.

More general:

{1,4} means from 1 to 4 (inclusive) occurrences of the character.
{2,} means 2 or more occurrences of the character.

So for example:
[0-9]{2,4} means 2, 3 or 4 digits, e.g. "12" or "345" or "6789".
[a-z]{1,} means 1 or more lower case letters, e.g. "z" or "abc" or "qwertyuiop". This is equivalent to [a-z]@
Best wishes,
Hans
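
To try the {n,m} form out, here is a small sketch that counts runs of 2 to 4 digits in the active document. CountDigitRuns is a made-up name. One assumption to flag: as I understand it, on systems where the Windows list separator is a semicolon, Word expects {2;4} rather than {2,4}, so the sketch reads the separator from Application.International(wdListSeparator) instead of hard-coding the comma.

```vba
Sub CountDigitRuns()
    ' Hypothetical demo: counts matches of [0-9]{2,4} in the active document.
    Dim sep As String
    Dim n As Long
    Dim rng As Range

    sep = Application.International(wdListSeparator)   ' "," or ";" depending on regional settings
    Set rng = ActiveDocument.Content

    With rng.Find
        .ClearFormatting
        .Text = "[0-9]{2" & sep & "4}"   ' 2, 3 or 4 digits in a row
        .MatchWildcards = True
        .Forward = True
        .Wrap = wdFindStop
        .Format = False
        Do While .Execute
            n = n + 1
            rng.Collapse wdCollapseEnd   ' carry on searching after this hit
        Loop
    End With

    Debug.Print n & " run(s) of 2 to 4 digits found"
End Sub
```

Swapping the .Text line for "[a-z]{1" & sep & "}" or "[a-z]@" should let you compare the two equivalent forms mentioned above.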

Doc.AElstein
BronzeLounger
Posts: 1499
Joined: 28 Feb 2015, 13:11
Location: Hof, Bayern, Germany

Re: VBA Remove BB Code Code Tags in text. Help with Wild ca

Post by Doc.AElstein »

Hi Hans
HansV wrote:... [!\]=] finds the character "c" (the first letter of "color") after the "[". It does NOT find "o", "l", etc. because these characters aren't adjacent to the "["......
Ahh, thanks. I had gone astray as in my experimenting I had isolated the [!\]=] sometimes. My last conclusions may have been OK but I was not quite talking about the situation at hand...

I am following all the rest you wrote. ( I remembered the ; instead of , http://www.eileenslounge.com/viewtopic. ... 03#p175730 :) )

Also , I think , as already pointed out to me \2 will find the second thing I put in a ( ) etc.....

That’s all very helpful, thanks. The literature is very confusing and often incomplete. Without the experience it can really be a brain killer.

I have a feeling I have it finally.

Alan

_................

@ Paul
Re_....
macropod wrote:.....
IMHO, changing the Find expression as you did adds nothing to its clarity
_.. I am sure you are correct. This is just my personal preference, and I am sure most people would agree with you. I may also agree later when I have more experience_... and
macropod wrote:.........- and doing so on more complex find expressions is liable to result in an 'expression too complex' error.
That is a very useful thing to know and a valid point. I have heard that doing things generally with wild card matches and the such can be a real killer on memory etc, due to its complexity, so this is a case where my preferred style could be very unwise. I will bear it in mind. Thanks again.
Alan
P.S. I am into the first hour of googling "agnostic" just now, lol.. :smile:
_..............





_.I need to clear my brain a bit, then I may just follow up and do a summary as a lot of useful stuff is here. ( I finished a fairly complicated code to do the same. I will share that as well just for fun. :) )
I am having difficulty logging in with this account just now.
You can find me at DocAElstein also


Wild Things. You make my heart sing. They make everything..

Post by Doc.AElstein »

Wild Things. You make my heart sing. They make everything….. Groovy
:smile:
Wild Things. You make my heart sing. They make everything….. Groovy
http://listenonrepeat.com/watch/?v=Hce7 ... Wild_Thing :music:
Wild things…. :hairout: I think I hate you :scratch: .. But I wanna know for sure… so

A Summary :)( Word Wild card Solution to Part 1 )
As example from this sample string _..
[color=green]This is green[/color] This ‘aint matey
_.. we want this:
This is green This ‘aint matey
We use a combination of Wild stuff , or like a pattern to search for [ Find ] _..
[color=green]This is green[/color]
_.. and [ Replace ] that “Find”ed thing with
This is green
We can do, it Wildly, man ( well I can’t actually, but Hans and Paul can! )
( The point of doing this in word is to retain the color format in the text after removing the BB Code Code Tags. – I currently have it in a format like [color=Purple]ThisIsAGayColor,Purple[/color] )

As always the search is done from left to right

The way I do it below is probably not the most efficient way, in fact it is probably pretty stupid, but is just to demo the idea
Optional use of ( ) in the Wild thing search string
We can use brackets ( ) , optionally , to identify, ( for later use ) , the exact string parts found by the constituent wild things in the ( ) in which they are in:
So in this example we will use a ( ) for
_ the wild things that find the color so that the actual word color can be used again in the same wild stuff string, so as to make sure that we find in the end tag the matching word in the start tag.
_ Also we need a ( ) to identify the wild things that Find This is green , as we want to use that in , or rather for the complete, Replace string.
A bracket is identified by , for example, \2 if I want to reference the actual string found by the wild things we enclose in the second ( )

Build up final Wild thing Replace string, left to right, by breaking string we want up into bits. Each bit will be found by a Wild thing or things
So we are looking to break the string to search for [ Find ] down thus:

Code: Select all

[ &  ( AnythingWithout=or] )  &  anything  &  ]  & (Anything)  &  [/ & \1 & ]
So I am breaking it down in to 8 bits so need 8 sections of wild stuff to Find those.
\1 is at the 7th position and refers to what is actually found by the search for ( AnythingWithout=or] ) at position 2 ( which is bracket ( ) number 1 )

Word starts at the left. What no literature really states clearly ( and is only obvious once you know it ) is this:
_ For most sections, Word keeps going to the right until it Finds what the Wild thing tells it to. That found string is only kept if it has joined directly to it a string of the type in the next section’s Wild thing. Those two joined sections are then kept ( and either of them is also stored separately if it is enclosed in ( ) ), but again only if joined directly to them is a string of the type in the next section’s Wild thing, and so on to the end of the Find string.
A common Wild thing is * . This means any characters, and any amount of them. It will, however, only take in characters up to the point where the next Wild thing is satisfied.

The 7th Wild thing I already have which is \1 which is referring to the 1st bit enclosed in ( ) , which is the second bit in the broken up string , ( which is anything without a = or a ] . In our example this will be color ) . Note: The number refers to the number from the left of any ( ) that I may have included, not the actual sections.
Just to make that clear I will do this_..

Code: Select all

[   &  ( AnythingWithout=or] )  &  ( Anything  &  ] )  &  (Anything)  &  [/  &  \1  &  ]
_.. so my Replace is then \3 . I do not need the second ( ) , referenced by \2, but it can help you to get an “expression too complex” error, should you be silly enough to want to do that.
I put the second ( ) in also to demo that a ( ) can include more than one wild bit

So here is one example Find string with Wild things that “works” ( I added spaces to show better the 8 sections, but in use those spaces must be removed ). There are three optional ( ) sections. The second ( ) section encloses two Wild thing sections, the third and fourth parts counting from the left.

Code: Select all

 [\[]    ([!\]=]@)    (*     \])    (*)    \[/    (\1)     \]
This is the actual Wild Find string [\[]([!=\]]@)(*\])(*)\[/(\1)\]

To ExPlain:
_..It was sadly a bit too big and Beautiful to fit in here :( :sad:
A full explanation can be found here:
“Wild Things. You make my heart sing. They make everything….. Groovy”
http://www.excelfox.com/forum/showthrea ... #post10110

Following the description is a simplified VBA code to use above Find Replace technique.
http://www.excelfox.com/forum/showthrea ... #post10111
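For anyone who cannot follow the link, here is a minimal sketch of how the Find string above might be dropped into a Word macro. It is not the code from the linked post: the macro name is made up, and it assumes the whole active document is to be processed in a single Replace All pass. The Find string is copied from the "actual Wild Find string" line above, and the Replace string is the \3 described earlier.

```vba
' Hypothetical sketch: remove BB Code tag pairs from the active document,
' keeping the text between them (the third ( ) group).
Sub BBCodeTagPairsOut()
    With ActiveDocument.Content.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Text = "[\[]([!=\]]@)(*\])(*)\[/(\1)\]" ' Find string from this post
        .Replacement.Text = "\3"                 ' what the third ( ) found
        .MatchWildcards = True                   ' treat .Text as Word wildcards
        .Forward = True
        .Wrap = wdFindContinue
        .Format = False                          ' not searching on formatting
        .Execute Replace:=wdReplaceAll
    End With
End Sub
```

Trying it first on a copy of the document seems wise, since Replace All changes every match in one go.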

_.................

Solution Part 2. ( Code part to manipulate a text string containing BB Code Code Tags copied to the clipboard)
The most sensible way to do this would be to paste the text string from the clipboard into a Word doco, then run the last code above, then copy the whole modified text back to the clipboard.
But I did it in a very long insensible way because I could. It does have the advantage of not being case sensitive on the BB Code Code Tag word , but I doubt that is a good enough excuse to justify the complexity. I expect the code is not as foolproof as the Find Replace one above. But maybe I will combine the two into a very very Big and Beautiful code which uses both techniques and then checks for the same results, as a sort of “Belt and Braces” or just plain “stupid” approach.
The code and explanations are , sadly , :( too pretty, Big and Beautiful to fit in here. :sad:
Here are the explanations:
http://www.excelfox.com/forum/showthrea ... #post10108
The code is in the Post after it.
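The "most sensible way" mentioned at the start of this part could look roughly like the sketch below. It is only an assumption of how the steps might fit together, not the linked code: the macro name is made up, the temporary document is thrown away unsaved, and what lands back in the clipboard is whatever Word's Copy puts there ( formatted text ) , so getting simple plain text would need a further step.

```vba
' Hypothetical sketch: clipboard -> temporary Word document -> strip tags -> clipboard.
' Raises a run-time error if the clipboard holds nothing Word can paste.
Sub ClipboardBBCodeTagPairsOut()
    Dim doc As Document
    Set doc = Documents.Add                      ' temporary scratch document
    doc.Content.Paste                            ' bring in the clipboard text
    With doc.Content.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Text = "[\[]([!=\]]@)(*\])(*)\[/(\1)\]" ' Find string from Part 1
        .Replacement.Text = "\3"
        .MatchWildcards = True
        .Forward = True
        .Wrap = wdFindContinue
        .Format = False
        .Execute Replace:=wdReplaceAll
    End With
    doc.Content.Copy                             ' modified text back to clipboard
    doc.Close SaveChanges:=wdDoNotSaveChanges    ' discard the scratch document
End Sub
```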

Alan

EDIT: March 2017: Here I was able to explain another Thread question and solution, based on what I learnt here:
https://www.excelforum.com/word-formatt ... ost4604396
Last edited by Doc.AElstein on 14 Mar 2017, 10:11, edited 1 time in total.


Re: VBA Remove BB Code Code Tags in text. Help with Wild ca

Post by Doc.AElstein »

Hi,
Just a last feedback on this Thread in case anyone might be wanting a code of this sort. (Arguably not too likely.. lol.. )

I took the code I developed with the help here and the long simple string manipulation code I did and spliced them together.. Codes and description for that start about here:
http://www.excelforum.com/development-t ... ost4586804
( Or alternatively here, where you do not have to be logged in:
http://www.excelfox.com/forum/showthrea ... #post10117 )
The code is much longer than it needs to be, but can be used as the basis for tailoring it to a particular need. The code simply removes the BB Code code tag pairs, in the two ways, from a Text copied to the clipboard from a Word document.
Finally, a check is done to see if the two strings, stripped of their tags, are the same.
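To give a flavour of the simple string manipulation way, here is a much shorter sketch than the linked code, and not the same code. The function name and the logic are assumptions for illustration: it looks for an opening [ ... ] , takes the tag word up to any = , and searches for the matching [/tag] without regard to case. Tags with no matching end tag are left alone.

```vba
' Hypothetical sketch: strip BB Code tag pairs from a plain string,
' matching the end tag word case-insensitively.
Function StripBBCodePairs(ByVal s As String) As String
    Dim posOpen As Long, posEnd As Long, posClose As Long
    Dim tagName As String, i As Long, startAt As Long
    startAt = 1
    Do
        posOpen = InStr(startAt, s, "[")
        If posOpen = 0 Then Exit Do
        posEnd = InStr(posOpen + 1, s, "]")
        If posEnd = 0 Then Exit Do
        ' Tag word is everything between [ and ] , cut off at any =
        tagName = Mid$(s, posOpen + 1, posEnd - posOpen - 1)
        i = InStr(tagName, "=")
        If i > 0 Then tagName = Left$(tagName, i - 1)
        posClose = 0
        If Len(tagName) > 0 And Left$(tagName, 1) <> "/" Then
            posClose = InStr(posEnd + 1, s, "[/" & tagName & "]", vbTextCompare)
        End If
        If posClose > 0 Then
            ' Keep text before the start tag, between the tags, and after the end tag
            s = Left$(s, posOpen - 1) & _
                Mid$(s, posEnd + 1, posClose - posEnd - 1) & _
                Mid$(s, posClose + Len(tagName) + 3)
            startAt = posOpen      ' rescan from here in case of inner tags
        Else
            startAt = posOpen + 1  ' no matching end tag, skip this [
        End If
    Loop
    StripBBCodePairs = s
End Function
```

For example, Debug.Print StripBBCodePairs("[COLOR=green]This is green[/color] This 'aint matey") should give This is green This 'aint matey , and that result could then be compared with the text produced by the Find Replace way.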

It seems initially to work fine for any text with BB Code code tags in it similar to any text from a post here. ( I have not considered conversion of text including a BB Code constructed table. Any text parts like that will need to be avoided )

_........__

Just to wrap it up, I did consider a code to cover the case of “Nested code tags of similar type”.
I will try to share the brief details of that in the next post: