Read rtf file in vba

YasserKhalil
PlatinumLounger
Posts: 4936
Joined: 31 Aug 2016, 09:02

Read rtf file in vba

Post by YasserKhalil »

Hello everyone
I have an .rtf file and I need to make it readable in Excel VBA.

This is the code I have, but the contents come out weird:

Code: Select all

Sub A01_MyTest()
    Dim myFile As String
    Dim text As String
    Dim lines() As String
    Dim i As Long
    Dim title As String
    Dim pageNumber As String
    myFile = ThisWorkbook.Path & "\Sample.rtf"
    Open myFile For Input As #1
    text = Input(LOF(1), 1)
    Close #1
    lines = Split(text, vbCrLf)
    
End Sub

HansV
Administrator
Posts: 78783
Joined: 16 Jan 2010, 00:14
Status: Microsoft MVP
Location: Wageningen, The Netherlands

Re: Read rtf file in vba

Post by HansV »

A .rtf file is not a plain text file. It is comparable to a Word document - in fact, you can open it in Word, or in WordPad.
This is what part of the file looks like in Word:

S2546.png

As you can see, it is not really suitable for opening it in Excel.
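The "weird" contents that the original code shows are the raw RTF markup itself: an .rtf file is a stream of control words and groups that starts with {\rtf1, and Word or WordPad interpret that markup for display. A minimal sketch of a check for that signature follows. The helper name LooksLikeRtf is made up for illustration and is not part of anything posted in this thread.

```vba
Option Explicit

' Hypothetical helper: True if the file starts with the RTF signature "{\rtf".
Function LooksLikeRtf(ByVal FilePath As String) As Boolean
    Dim hFile As Integer
    Dim Head As String
    hFile = FreeFile                         ' ask VBA for an unused file number
    Open FilePath For Binary Access Read As #hFile
    If LOF(hFile) >= 5 Then
        Head = Space$(5)                     ' buffer length decides how many bytes Get reads
        Get #hFile, , Head
    End If
    Close #hFile
    LooksLikeRtf = (Head = "{\rtf")
End Function
```

Calling LooksLikeRtf(ThisWorkbook.Path & "\Sample.rtf") before parsing would tell you whether the text needs an RTF-aware reader rather than a plain Split.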
Best wishes,
Hans

SpeakEasy
4StarLounger
Posts: 593
Joined: 27 Jun 2021, 10:46

Re: Read rtf file in vba

Post by SpeakEasy »

So, do you want to read and display the contents as they are (i.e. what Hans has shown), or just extract the text content? Both are possible. Which do you need?


Re: Read rtf file in vba

Post by YasserKhalil »

I just need to make the contents readable in Excel VBA, exactly like any Word document.

DocAElstein
5StarLounger
Posts: 639
Joined: 18 Jan 2022, 15:59
Location: Re-routing rivers, in Hof, Beautiful Bavaria

Re: Read rtf file in vba

Post by DocAElstein »

Hi
I had an idea for getting just the text. I can get it to work, but only with Early Binding, so you must set a reference to Word.
( The coding runs from Excel VBA )

This is my idea (just to get the text):
_ Use coding in Excel VBA to open Sample.rtf in Word
_ Save that file, still from Excel VBA, as a Word .htm file
( _ Now you should have a .htm file, which you can open with a text editor, and it looks like this: https://i.postimg.cc/02XgrDTk/Word-htm- ... editor.jpg )
_ Now you can open that .htm file with the kind of coding you are already familiar with, to get the text and manipulate it


Example: This should put all the text in a variable, Pages
( If this works, then you know how to put that into a spreadsheet as / if you want it )
The coding goes in the Excel file

Code: Select all

 Option Explicit
Sub A01_MyTest()  '    http://www.eileenslounge.com/viewtopic.php?p=316904#p316904
Rem 0
Rem 1 Open the .rtf file with  Word  then save it as a  Word  htm  file
'Dim Wrd As Object  ' The problem with Late Binding is that I cannot use  , FileFormat:=wdFormatHTML
' Set Wrd = CreateObject("Word.Application")
Dim Wrd As Word.Application ' Early Binding: needs a reference to the Microsoft Word Object Library
 Set Wrd = New Word.Application
 Let Wrd.Visible = True
 Wrd.Documents.Open Filename:=ThisWorkbook.Path & "\Sample.rtf"
 'Wrd.ActiveDocument.SaveAs2 Filename:=ThisWorkbook.Path & "\Sample.htm", FileFormat:=wdFormatHTML
 'Wrd.ActiveDocument.SaveAs2 Filename:=ThisWorkbook.Path & "\Sample.htm", FileFormat:=8
 'Wrd.ActiveDocument.SaveAs2 Filename:=ThisWorkbook.Path & "\Sample.htm"  ' This works but I get strange format with  Late Binding -
 Wrd.ActiveDocument.SaveAs Filename:=ThisWorkbook.Path & "\Sample.htm", FileFormat:=wdFormatHTML
 Wrd.ActiveDocument.Close
 Wrd.Quit
 Set Wrd = Nothing
Rem 2  Get the  htm  file as long text string
Dim myFile As String, Text As String
 'Let myFile = ThisWorkbook.Path & "\Sample.htm"
 Let myFile = ThisWorkbook.Path & "\Sample correct htm text.txt"
Dim FrFl As Long
 Let FrFl = FreeFile(0)
Open myFile For Input As FrFl
 Let Text = Input(LOF(FrFl), FrFl)
Close #FrFl

Rem 3 Pick out the main text bit
Dim WrdSec As Long, Pos1 As Long: Let WrdSec = 1
 Let Pos1 = InStr(1, Text, "<div class=WordSection" & WrdSec & ">", vbBinaryCompare) ' This is where main text usually starts
    
    Do While Pos1 <> 0 ' ===============================================
    Dim Pos2 As Long
     Let Pos2 = InStr(Pos1 + 24, Text, "</div>", vbBinaryCompare) ' This is usually where main text finishes
    Dim Pages As String
     Let Pages = Pages & Mid(Text, Pos1 + 24, Pos2 - (Pos1 + 24))
     Let Pages = RemoveHTML(Pages) ' get rid of most  html  stuff
     Let Pages = Replace(Pages, "&nbsp;", "", 1, -1, vbBinaryCompare)
     
     Let WrdSec = WrdSec + 1
     Let Pos1 = InStr(Pos2, Text, "<div class=WordSection" & WrdSec & ">", vbBinaryCompare)
    Loop ' While Pos1 <> 0 ' ===========================================
Rem 4 Output
 Let ThisWorkbook.Worksheets.Item(1).Range("A1") = Pages
End Sub 
Note: I do not understand this coding too well. I do not know much Word VBA. Some of it I got from the macro recorder, and some I found on the internet. Most probably there is a better way.
It is working for me with your .rtf file, so far in Office 2013 and Office 2007.
Just as an example, Rem 4 puts all the text in the first cell, but I am sure you know many ways to put that text, Pages, into Excel differently. ( You know how to Split by vbCr & vbLf etc. etc. )
https://i.postimg.cc/PxKdQ4fC/Put-all-t ... t-cell.jpg
Put all text in first cell.JPG
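One way the Split idea mentioned above could look is sketched below, as an alternative to putting everything in one cell. It is only a sketch: the helper name WriteLinesToColumn is made up, and it assumes the extracted text may contain any mix of vbCrLf, vbCr and vbLf line endings.

```vba
Option Explicit

' Hypothetical helper: writes each line of Pages into its own row, starting at TopCell.
Sub WriteLinesToColumn(ByVal Pages As String, ByVal TopCell As Range)
    Dim Lines() As String
    Dim Cells2D() As Variant
    Dim i As Long
    ' Normalise line endings so that one Split handles vbCrLf, vbCr and vbLf
    Pages = Replace(Replace(Pages, vbCrLf, vbLf), vbCr, vbLf)
    Lines = Split(Pages, vbLf)
    If UBound(Lines) < 0 Then Exit Sub            ' empty input, nothing to write
    ReDim Cells2D(1 To UBound(Lines) + 1, 1 To 1)
    For i = 0 To UBound(Lines)
        Cells2D(i + 1, 1) = Lines(i)
    Next i
    ' One write for all rows is much faster than writing cell by cell
    TopCell.Resize(UBound(Lines) + 1, 1).Value = Cells2D
End Sub
```

In the macro above, the Rem 4 line could then become WriteLinesToColumn Pages, ThisWorkbook.Worksheets.Item(1).Range("A1").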


Alan

P.S. ( One problem I have is that I cannot get it to work with Late Binding, because with Late Binding I cannot seem to get the format correct at the .SaveAs2 code line. Maybe someone else can help with that? )
Last edited by DocAElstein on 01 May 2024, 08:08, edited 1 time in total.
I seriously don’t ever try to annoy. Maybe I am just the kid that missed being told about the King’s new magic suit, :(


Re: Read rtf file in vba

Post by YasserKhalil »

I tried the file you attached but got an error `Invalid procedure call or argument` at the first line `Sub A01_MyTest()`


Re: Read rtf file in vba

Post by DocAElstein »

I forgot to post the function you also need ( but all the coding was, and is, in the file as well, so I don't understand that error if you used the coding in the file??? )
( Remember also that I said ... you must set a reference to Word )

Code: Select all

Function RemoveHTML(Text As String) As String  '  https://stackoverflow.com/questions/47366923/remove-html-tags-from-string-in-excel-vba   https://eileenslounge.com/viewtopic.php?p=316929#p316929
Dim regexObject As Object
Set regexObject = CreateObject("vbscript.regexp")

    With regexObject
        .Pattern = "<!*[^<>]*>"    'html tags and comments
        .Global = True
        .IgnoreCase = True
        .MultiLine = True
    End With

 Let RemoveHTML = regexObject.Replace(Text, "")
End Function
_.________________________________________________-

( P.S. For anyone else passing: I would be grateful if someone could
_ explain, clearly enough that I can understand, how this bit works
"<!*[^<>]*>" 'html tags and comments
, and
_ explain how I can get this to work with Late Binding. ( The problem is in the SaveAs line: I cannot find a working argument that makes the final format in the .htm file correct. ) Failing that, an explanation as to why it cannot be done would be welcome. Is this one of those rare occasions when Early and Late Binding make an important difference?
Thanks )
Last edited by DocAElstein on 01 May 2024, 08:51, edited 2 times in total.
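On the Late Binding question, one likely cause is that the name wdFormatHTML only exists when the Word library is referenced. Without the reference, and without Option Explicit, VBA treats it as an empty variable, so Word receives 0 (its normal document format) instead of 8. A minimal sketch of a late-bound version follows, declaring the constant by hand. The procedure name SaveRtfAsHtmLateBound is made up for illustration. It also uses SaveAs rather than SaveAs2, on the assumption that SaveAs2 is not available before Word 2010.

```vba
Option Explicit

' Numeric value of Word's wdFormatHTML, declared here because Word's own
' constants are not visible without a reference to the Word library.
Private Const wdFormatHTML As Long = 8

' Hypothetical helper: opens an .rtf file in Word and saves it as .htm, late bound.
Sub SaveRtfAsHtmLateBound(ByVal RtfPath As String, ByVal HtmPath As String)
    Dim Wrd As Object
    Dim Doc As Object
    Set Wrd = CreateObject("Word.Application")     ' no reference to Word needed
    Set Doc = Wrd.Documents.Open(Filename:=RtfPath)
    Doc.SaveAs Filename:=HtmPath, FileFormat:=wdFormatHTML
    Doc.Close SaveChanges:=False
    Wrd.Quit
    Set Doc = Nothing
    Set Wrd = Nothing
End Sub
```

It would be called as SaveRtfAsHtmLateBound ThisWorkbook.Path & "\Sample.rtf", ThisWorkbook.Path & "\Sample.htm". There is no error handling here, so if the Open or SaveAs fails, the hidden Word instance is left running and has to be closed by hand.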


Re: Read rtf file in vba

Post by YasserKhalil »

I got an error `File not found` at this line `Open myFile For Input As FrFl`
I think this is the cause
`Let myFile = ThisWorkbook.Path & "\Sample correct htm text.txt"`


Re: Read rtf file in vba

Post by DocAElstein »

Yes sorry, it should be the line I commented out,
Let myFile = ThisWorkbook.Path & "\Sample.htm"


Re: Read rtf file in vba

Post by YasserKhalil »

I have uncommented the line and the same problem persists.


Re: Read rtf file in vba

Post by DocAElstein »

You must (obviously) also remove the next line (the erroring line):
Let myFile = ThisWorkbook.Path & "\Sample correct htm text.txt"

Code: Select all

Option Explicit
Sub A01_MyTest()  '    http://www.eileenslounge.com/viewtopic.php?p=316904#p316904
Rem 0
Rem 1 Open the .rtf file with  Word  then save it as a  Word  htm  file
'Dim Wrd As Object  ' The problem with Late Binding is that I cannot use  , FileFormat:=wdFormatHTML
' Set Wrd = CreateObject("Word.Application")
Dim Wrd As Word.Application ' Early Binding: needs a reference to the Microsoft Word Object Library
 Set Wrd = New Word.Application
 Let Wrd.Visible = True
 Wrd.Documents.Open Filename:=ThisWorkbook.Path & "\Sample.rtf"
 'Wrd.ActiveDocument.SaveAs2 Filename:=ThisWorkbook.Path & "\Sample.htm", FileFormat:=wdFormatHTML
 'Wrd.ActiveDocument.SaveAs2 Filename:=ThisWorkbook.Path & "\Sample.htm", FileFormat:=8
 'Wrd.ActiveDocument.SaveAs2 Filename:=ThisWorkbook.Path & "\Sample.htm"  ' This works but I get strange format with  Late Binding -
 Wrd.ActiveDocument.SaveAs Filename:=ThisWorkbook.Path & "\Sample.htm", FileFormat:=wdFormatHTML
 Wrd.ActiveDocument.Close
 Wrd.Quit
 Set Wrd = Nothing
Rem 2  Get the  htm  file as long text string
Dim myFile As String, Text As String
 Let myFile = ThisWorkbook.Path & "\Sample.htm"
 'Let myFile = ThisWorkbook.Path & "\Sample correct htm text.txt"
Dim FrFl As Long
 Let FrFl = FreeFile(0)
Open myFile For Input As FrFl
 Let Text = Input(LOF(FrFl), FrFl)
Close #FrFl

Rem 3 Pick out the main text bit
Dim WrdSec As Long, Pos1 As Long: Let WrdSec = 1
 Let Pos1 = InStr(1, Text, "<div class=WordSection" & WrdSec & ">", vbBinaryCompare) ' This is where main text usually starts
    
    Do While Pos1 <> 0 ' ===============================================
    Dim Pos2 As Long
     Let Pos2 = InStr(Pos1 + 24, Text, "</div>", vbBinaryCompare) ' This is usually where main text finishes
    Dim Pages As String
     Let Pages = Pages & Mid(Text, Pos1 + 24, Pos2 - (Pos1 + 24))
     Let Pages = RemoveHTML(Pages) ' get rid of most  html  stuff
     Let Pages = Replace(Pages, "&nbsp;", "", 1, -1, vbBinaryCompare)
     
     Let WrdSec = WrdSec + 1
     Let Pos1 = InStr(Pos2, Text, "<div class=WordSection" & WrdSec & ">", vbBinaryCompare)
    Loop ' While Pos1 <> 0 ' ===========================================
Rem 4 Output
 Let ThisWorkbook.Worksheets.Item(1).Range("A1") = Pages
End Sub
Function RemoveHTML(Text As String) As String  '  https://stackoverflow.com/questions/47366923/remove-html-tags-from-string-in-excel-vba  https://eileenslounge.com/viewtopic.php?p=316929#p316929
Dim regexObject As Object
Set regexObject = CreateObject("vbscript.regexp")

    With regexObject
        .Pattern = "<!*[^<>]*>"    'html tags and comments
        .Global = True
        .IgnoreCase = True
        .MultiLine = True
    End With

 Let RemoveHTML = regexObject.Replace(Text, "")
End Function
Last edited by DocAElstein on 02 May 2024, 06:22, edited 2 times in total.


Re: Read rtf file in vba

Post by SpeakEasy »

Ok, here's a shorter, more lightweight example.

Code: Select all

' Requires reference to Microsoft InkEdit control
Public Sub RTFExample()
    Dim myfile As String
    Dim hFile As Long
    Dim Text As String
    Dim ProxieObj As OLEObject
    Dim myink As InkEdit
    
    Set ProxieObj = ActiveSheet.OLEObjects.Add(ClassType:="InkEd.InkEdit.1", Link:=False, DisplayAsIcon:=False, Left:=588, Top:=50.4, Width:=363, Height:=317.4) ' by default this will be visible
    Set myink = ProxieObj.Object ' Ok, get a reference to the underlying RTF-capable InkEdit object

    myfile = "d:\downloads\deleteme\sample.rtf" 'your filename would go here
    
    ' Read source RTF as per your original code (almost)
    hFile = FreeFile
    Open myfile For Input As hFile
    Text = Input(LOF(1), 1)
    Close hFile
    
    myink.TextRTF = Text ' at this point the RTF document is visible on the InkEdit control
    Text = myink.Text ' and at this point Text contains the plaintext version of the RTF file
    
    ' Do something with Text if you need
    
    'ProxieObj.Delete ' use this to get rid of our RTF display if we were only looking to extract text
End Sub


Re: Read rtf file in vba

Post by YasserKhalil »

Amazing. Thank you very much.


Re: Read rtf file in vba

Post by YasserKhalil »

......... to be deleted .............
Last edited by YasserKhalil on 01 May 2024, 10:04, edited 1 time in total.


Re: Read rtf file in vba

Post by DocAElstein »

YasserKhalil wrote:
01 May 2024, 09:58
This line ``ProxieObj.Delete`` didn't work as I still get the object in the worksheet
.. getting ahead of yourself, once again, and not carefully reading all that is posted for you ! :smile: :) :sigh:
I think he edited that out, but it is nevertheless useful to know about, or it was .......
( And by the way, Yasser, it's not helpful to edit a post in such a way that the following posts become confusing.
Maybe you should always refresh, and recheck all previous posts, before jumping in. It can be frustrating and unkind to the people helping you when you just quickly read the last bit you saw and immediately reply, missing things unnecessarily, JIMHO )
_.________________________________


Looks very impressive, tested in Office 2007
Image

Code: Select all

'  https://eileenslounge.com/viewtopic.php?p=316936#p316936
' Requires reference to Microsoft InkEdit control
Public Sub RTFExample()
    Dim myfile As String
    Dim hFile As Long
    Dim Text As String
    Dim ProxieObj As OLEObject
    Dim myink As InkEdit
    
    Set ProxieObj = ActiveSheet.OLEObjects.Add(ClassType:="InkEd.InkEdit.1", Link:=False, DisplayAsIcon:=False, Left:=588, Top:=50.4, Width:=363, Height:=317.4) ' by default this will be visible
    Set myink = ProxieObj.Object ' Ok, get a reference to the underlying RTF-capable InkEdit object

'    myfile = "d:\downloads\deleteme\sample.rtf" 'your filename would go here
    myfile = ThisWorkbook.Path & "\Sample.rtf"
    
    ' Read source RTF as per your original code (almost)
    hFile = FreeFile
    Open myfile For Input As hFile
    Text = Input(LOF(1), 1)
    Close hFile
    
    myink.TextRTF = Text ' at this point the RTF document is visible on the InkEdit control
    Text = myink.Text ' and at this point Text contains the plaintext version of the RTF file
    
    ' Do something with Text if you need
    
    'ProxieObj.Delete ' use this to get rid of our RTF display if we were only looking to extract text
Rem 4 Output
 Let ThisWorkbook.Worksheets.Item(1).Range("A2") = Text
End Sub

( It does not work in debug step mode, by the way : https://i.postimg.cc/K8TGH7XN/Switching ... -point.jpg )

_.________________________________
Share ‘Text from rtf file.xls’ https://app.box.com/s/jln958lihpvbtj4axk283afmrvu8wwi9
Last edited by DocAElstein on 01 May 2024, 10:59, edited 1 time in total.


Re: Read rtf file in vba

Post by SpeakEasy »

> explain clearly enough that I can understand, how this bit works
>"<!*[^<>]*>" 'html tags and comments

Have a quick look at the Patterns section here to see if that helps.

Here's a quick summary:

< matches the literal character <
!* matches zero or more ! characters
[^<>]* matches zero or more characters that are not < or >
> matches the literal character >

So, the regular expression matches HTML tags that start with < and end with >, with the possibility of ! characters occurring before the tag name. Replace then swaps each match for an empty string, i.e. it effectively removes all tags.
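To see the pattern at work on something small, here is a minimal sketch. StripTags and StripTagsDemo are made-up names, and the sample string is invented for illustration rather than taken from the real .htm file.

```vba
Option Explicit

' Hypothetical demo wrapper around the pattern discussed above.
Function StripTags(ByVal Html As String) As String
    Dim Rx As Object
    Set Rx = CreateObject("VBScript.RegExp")
    Rx.Pattern = "<!*[^<>]*>"   ' "<", optional "!"s, anything except "<" or ">", then ">"
    Rx.Global = True            ' replace every match, not just the first
    StripTags = Rx.Replace(Html, "")
End Function

Sub StripTagsDemo()
    ' Matches removed: <p class=MsoNormal>  <b>  </b>  </p>  <!-- note -->
    Debug.Print StripTags("<p class=MsoNormal>Hello <b>world</b></p><!-- note -->")
    ' Prints: Hello world
End Sub
```

One limitation follows directly from the [^<>]* part: a comment or attribute that itself contains a > character is only partly removed, so this is a rough cleaner rather than a real HTML parser.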


Re: Read rtf file in vba

Post by SpeakEasy »

>It does not work in debug step mode

Works fine in debug step mode here ...

Try CTL-SHIFT-F9 to remove all breakpoints ... crazy I know, but give it a go.


Re: Read rtf file in vba

Post by DocAElstein »

SpeakEasy wrote:
01 May 2024, 10:38
..look at the Patterns section here ....

Here's a quick summary:.....
Thanks, I understand that. ( I have some phobia or autism when it comes to Wild things, and it's very frustrating when I keep finding things like that Function RemoveHTML everywhere, but nowhere do I find a quick ' comment to explain it, which, JIMHO, would generally increase the value of a shared post tons. )
Thanks for the link. It still takes me a bit of effort to tie up an explanation to documentation like that. I think we are saying we have 4 separate things we want to look for, just as you listed, which helps make it clearer. In fact, after re-reading both that documentation and what you wrote a few times, it's clear to me now, and I will certainly add a ' comment version of the explanation to my coding.
The link looks like it might be a good older link *** ( <=2013, at a guess, based on the archive.org Wayback Machine having copies from 2013 ). I find some of the older stuff is often the best, but also often the most difficult to hit on a search. I think people are figuring out now how to get their stuff up the Google search results, or are paying for it. I think when ChatGPT replaces all the forum answerers in a few months, there will still be a good job to be done finding or giving the best explanations. ( Maybe ChatGPT will figure that out as well, and / or it will be in the paid version, Lol. At least then some of us will have half a chance to compete, maybe by making our explanations amusing or something, Lol. )
_._____________________________________
SpeakEasy wrote:
01 May 2024, 11:05
Try CTL-SHIFT-F9 to remove all breakpoints ... crazy I know, but give it a go.
That has no effect here**,
but I forgot to mention that I had also noticed that if I had a break point (a brown round in the margin), even right at the end, but ran normally, then I also got that error, so maybe that supports the idea that it is some break issue / bug / feature.
And I just checked adding a Stop, and there is a similar issue for me: put the Stop after the Set ProxieObj and you get the issue, even with a normal run attempt. Put the Stop before the Set ProxieObj, and a normal run will go as far as the Stop and stop there, as usually expected. But hit run again after it has stopped, and the issue returns, so maybe that also supports the idea that it is some Break / Stop issue / bug / feature.

**But I have only tried on Office 2007 and one computer, so far. I will make a mental note to check as I go along, and edit to say if it is different on other computers and versions. Maybe that will reveal something later.


_.________________

*** ( Edit P.S. I will drop the link of an early archived version
https://web.archive.org/web/20130807131 ... ssions.htm
because, maybe I am paranoid, but I think I notice better stuff vanishing.
I also noticed that if you try an old link by just clicking on it, it may get redirected, which you might overlook. Then you copy the new link from the browser URL bar after it pops up, or a search gives you the new link, and that new link is not archived. Or you do a Google search and get what appears to be an updated link for what you are looking for, but sometimes it ain't. A nasty little trick that Microsoft are using.
To find something on the Wayback Machine it's good to have the original link.
I noticed that sometimes, when you get a new link from a Google search for some Microsoft documentation, you will see in the URL
learn.microsoft.com
Change that bit to
docs.microsoft.com
instead, and look on the Wayback Machine for that link. On average it seems more likely to get you what you are actually looking for!

Generally, if you are given an old link and want to try your luck on the Wayback Machine, then best get in the habit of right click and copy link, rather than clicking on the link and copying what comes up in the URL bar of your default browser. Or rather, always look first on the Wayback Machine with the original link, if you can get it. )


Re: Read rtf file in vba

Post by SpeakEasy »

I'm a big user of the Wayback Machine (probably nostalgia for all the old documentation I originally learned stuff from).

Here are a few links to the documentation where I originally learned about regular expressions

Try this one in Wayback Machine: http://msdn2.microsoft.com/en-us/librar ... 85%29.aspx
This is the copy on learn: https://learn.microsoft.com/en-us/previ ... 2(v=vs.80)
And here's a 3rd party copy: https://admhelp.microfocus.com/uft/en/a ... 6a7353.htm

Whether this is useful to you or not, I don't know.


Changing links and OLe-up Poppies :)

Post by DocAElstein »

Those 3 links are perhaps useful to demonstrate / support what I was saying about Microsoft doing strange change things… ( unless I missed something again ) ….
The copy on learn claims to be an article from 2006. Well, I suppose technically speaking that statement could be correct. But it's just a very tiny part of the original (a brief, not particularly useful introduction part), so just a small part of what you get for the first link on the Wayback Machine. On the Wayback Machine the first link has a lot of captures from around 2008, so the dates may tie up approximately.

Use the first link normally and you get redirected to a newer article that does have about half of what is in the original. It looks more spacious, which is perhaps, in many people's eyes, prettier and less intimidating, or better suits a smart phone view, but it's missing things. We touched on this phenomenon a few weeks back with some other documentation: if someone already knows the subject, and maybe at some point saw the original, then at a quick glance what's at the new link may look to be the same. So in good faith they might pass it on, and can't understand why the person does not seem to get out of it quite what they expected he should. ( So they think he is an idiot, and move on, Lol! )

The third party link is a good copy of the original. That sort of thing was another possibility I was thinking about for getting a good copy of something. ( I had a quick go a while back myself at copying the html from a web page, and then, after enabling HTML at a forum post, I, like an ignorant fool, pasted it in…. It sort of seemed to work originally, but made a mess in the forum everywhere else, Lol. I had, of course, already heard that pasting a lot of html in a forum post was bad .. but like the kid that had to touch the stove after his mum told him it was hot …
Never mind, I learnt a few things as we fixed it. )
_.____

So, once again, it looks like having the original link, to use on Wayback machine, can be a good thing to have, especially for Microsoft stuff from the last 10-15 years
_.____

I, and Eileen’s Lounge generally, have not done so much with Reg Ex, or at least there are only a few posts I know of so far:
https://eileenslounge.com/viewtopic.php ... 55#p176255
https://eileenslounge.com/viewtopic.php ... 08#p288908
https://eileenslounge.com/viewtopic.php ... 44#p316944

It has a strong rival here with the Wild thing in Word VBA, which some people here are very good with.
I have little experience myself. I had heard that Reg Ex can fall down a bit when working on large amounts of data, but I think you or someone told me more recently that this has less to do with the thing itself and more to do with using it in VBA.
_.____________________________________
_.______________________________________


Sometime in the future I must have another good look at your OLE offering. I have gone into a depression often enough over a year or two when trying to make some layman's sense of what the whole COM, OLE, ActiveX stuff is about. Very interesting though, what you did, and you can even copy and paste what is finally there and you get the formatting. Originally when I saw it, I thought it was some sort of image.
So
_ you are taking in, as text, the weird looking universal format text that is rtf when opened with a text editor.
_ Your OLE object knows what to do with it, just as Word does.
It is interesting to see a working example of a COM, OLE, ActiveX thing. They have mostly been, to me, a vague conceptual thing that I can’t understand.
Next time I am feeling masochistic enough to think about COM, OLE, ActiveX stuff again I will take another look here.
As much as I gathered, it's all some sort of deep down fundamental Microsoft innards stuff to allow embedding Poppy up things or permanently there things (and it helps to make Windows insecure and unstable). Perhaps what you have done is some home made version of the things we can implant / paste in something like Excel, thanks to that underlying COM, OLE, ActiveX innards stuff?
( My implanted OLE Poppies are not going so well this year. They’re having some mental crisis of wanting attention again. I have to keep telling them all is OK when they pop up. Very annoying, but they don’t let me move on with things I wanted to keep going on in the background unless I personally, manually step in and tell them all is OK. The ones outside were very beautiful last year )


( I think I noticed the deliberate mistake you put in to check if we were awake: I expect Input(LOF(1), 1) would be better as Input(LOF(hFile), hFile) ? )
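A small helper would keep the file number consistent throughout, so the hard-coded 1 cannot creep back in. This is only a sketch: the name ReadAllText is made up, and it reads in Binary mode instead of using Input, on the assumption that the whole file should come back unchanged in a single read.

```vba
Option Explicit

' Hypothetical helper: returns the whole file as one string.
Function ReadAllText(ByVal FilePath As String) As String
    Dim hFile As Integer
    Dim Buf As String
    hFile = FreeFile                          ' ask VBA for an unused file number
    Open FilePath For Binary Access Read As #hFile
    If LOF(hFile) > 0 Then
        Buf = Space$(LOF(hFile))              ' size the buffer to the file length
        Get #hFile, , Buf                     ' fill the buffer in one read
    End If
    Close #hFile                              ' same number as in Open, whatever it was
    ReadAllText = Buf
End Function
```

In the InkEdit example, the Open / Input / Close lines could then shrink to Text = ReadAllText(myfile).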


Alan

_._______________________________________________



Hi
HansV wrote:
01 May 2024, 19:56
... turns out that each page is a section, so you actually want to extract sections
A quick question, partly out of interest, but it might have some small relevance as well … would these "sections" be the things I was seeing yesterday in my dismal attempt to get text from that .rtf file, ( via a stopover in a Word saved as a .htm file and then opened in a text editor) ? – These "<div" (container?) things
Word Sections in rtf htm text.JPG

Until yesterday I had assumed that there was just one of those, (maybe my files had been always small), and I had assumed they effectively held all the text. As it turned out there were 3 of them, so I had to loop and concatenate the text from them to get it all in my coding offering
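If those WordSection containers do correspond to Word's own sections, which is an assumption here and not something confirmed in this thread, then the text could be collected per section straight from Word without the .htm stopover. A minimal late-bound sketch, with the made-up name SectionTexts:

```vba
Option Explicit

' Hypothetical helper: returns one string per Word section of the given file.
Function SectionTexts(ByVal RtfPath As String) As Collection
    Dim Wrd As Object
    Dim Doc As Object
    Dim Sec As Object
    Dim Result As New Collection
    Set Wrd = CreateObject("Word.Application")     ' Late Binding, no reference needed
    Set Doc = Wrd.Documents.Open(Filename:=RtfPath, ReadOnly:=True)
    For Each Sec In Doc.Sections
        Result.Add Sec.Range.Text                  ' plain text of this section
    Next Sec
    Doc.Close SaveChanges:=False
    Wrd.Quit
    Set SectionTexts = Result
End Function
```

Each item may still end with a section break or paragraph mark character, so some trimming is probably needed before the text goes into cells.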

Alan
Last edited by DocAElstein on 02 May 2024, 14:17, edited 1 time in total.