Word Tables

jstevens
GoldLounger
Posts: 2631
Joined: 26 Jan 2010, 16:31
Location: Southern California

Post by jstevens »

I am opening a PDF file in Word and writing table information to an Excel workbook. I run into a challenge where a table's content isn't in a table structure.

Here is an example of the odd table structure:
EL_Screenshot 2022-12-15.png

Code:

Dim wApp As New Word.Application
Dim wDoc As Word.Document

wApp.Visible = True

Set wDoc = wApp.Documents.Open(Path_to_filename, False, ReadOnly:=True)

Is it possible to convert this section of the table so that I can pull the individual table row contents? I have code that pulls table row contents, but for these rows I get one long concatenated string for "Trans Date", "Transaction Description/Location" and "Amount" instead of three separate strings.

Your suggestions are appreciated.
Regards,
John

HansV
Administrator
Posts: 78545
Joined: 16 Jan 2010, 00:14
Status: Microsoft MVP
Location: Wageningen, The Netherlands

Re: Word Tables

Post by HansV »

Do you need to do this once or frequently? If once, I'd import into Excel, then use Data > Text to Columns to split the "faulty" rows into separate cells.

If you need to do this repeatedly: could you attach a sample Word document? That would enable me (and others) to experiment.
Best wishes,
Hans
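
For the repeated case, one possible approach is to split each concatenated row with a regular expression. The following is only a minimal sketch: the function name SplitTransactionLine is hypothetical, and the pattern assumes a layout of date (such as 12/01), then free text, then an amount with two decimals. The real rows are only visible in the attached screenshot, so the pattern would need adjusting to match them.

```vba
' Hypothetical helper (not from the thread): splits one concatenated
' transaction line into date, description and amount.
' ASSUMPTION: the line looks like "12/01 SOME DESCRIPTION 45.67".
Function SplitTransactionLine(ByVal rawText As String) As Variant
    Dim re As Object
    Dim matches As Object

    ' Late binding, so no extra reference is required.
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "^\s*(\d{1,2}/\d{1,2}(?:/\d{2,4})?)\s+(.+?)\s+(-?\$?[\d,]+\.\d{2})\s*$"

    If re.Test(rawText) Then
        Set matches = re.Execute(rawText)
        With matches(0)
            SplitTransactionLine = Array(.SubMatches(0), .SubMatches(1), .SubMatches(2))
        End With
    Else
        ' No match: return the text unchanged in the first slot.
        SplitTransactionLine = Array(rawText, "", "")
    End If
End Function

Sub Demo_SplitTransactionLine()
    Dim parts As Variant
    ' Made-up sample line, for illustration only.
    parts = SplitTransactionLine("12/01 SAMPLE STORE ANYTOWN CA 45.67")
    Debug.Print parts(0) & " | " & parts(1) & " | " & parts(2)
End Sub
```

The three array elements could then be written to three adjacent cells in place of the single concatenated value.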

Post by jstevens »

Hans,

I have attached a zipped file containing a sample Word document with tables.

Here is the Excel VBA code:

Code:

Sub Sample_ReadFromPDF()

    ' Early binding: requires a reference to the Microsoft Word Object Library.
    Dim wApp As New Word.Application
    Dim wDoc As Word.Document
    Dim tbCount As Long
    Dim tbIndx As Long
    Dim lr As Long
    Dim Path_Filename As String
    Dim tRow As Long, tCol As Long

    Path_Filename = "Z:\Documents\TestArea\EL_Sample_Word_Table2.docx"

    ' Sheet2 is the worksheet's code name, which may differ from its tab name.
    Sheet2.Activate
    Sheet2.Cells.ClearContents

    wApp.Visible = True

    Set wDoc = wApp.Documents.Open(Path_Filename, False, ReadOnly:=True)

    tbCount = wDoc.Tables.Count

    If tbCount > 0 Then

        For tbIndx = 1 To tbCount

            With wDoc.Tables(tbIndx)

                Application.StatusBar = "Working on Table: " & tbIndx & " of " & tbCount

                ' Skip one row, then write a heading for this table.
                lr = Sheet2.Range("A" & Sheet2.Rows.Count).End(xlUp).Row + 2
                Sheet2.Cells(lr, 1).Value = "Table-" & tbIndx

                For tRow = 1 To .Rows.Count

                    ' Last used row in column A; the table row goes below it.
                    lr = Sheet2.Range("A" & Sheet2.Rows.Count).End(xlUp).Row

                    For tCol = 1 To .Columns.Count

                        ' .Cell raises an error for cells that do not exist
                        ' (for example, merged cells); skip those.
                        On Error Resume Next
                        Sheet2.Cells(lr + 1, tCol).Value = WorksheetFunction.Trim(WorksheetFunction.Clean(.Cell(tRow, tCol).Range.Text))
                        On Error GoTo 0

                    Next tCol

                Next tRow

            End With

        Next tbIndx

    End If

    ' Cleanup is commented out, so Word stays open after the macro finishes.
    'wDoc.Close False
    'Set wDoc = Nothing

    'wApp.Quit
    'Set wApp = Nothing

    Application.StatusBar = False

    MsgBox "Finished"

End Sub

EL_Sample_Word_Table2.zip
Regards,
John

Post by HansV »

Oh dear - the text that should be in a table but isn't resides in a series of rectangle shapes, and so does some of the other text. Even the text that should be in one cell is split across several rectangles. It will be virtually impossible to process it.
You should get hold of the source of the PDF file instead. If that is impossible, I fear you're out of luck.
Best wishes,
Hans
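
To see the structure described above for yourself, a short inspection routine along these lines might help. It is a sketch only: the procedure name Sketch_ListShapeText is hypothetical, it assumes the same Word Object Library reference as the macro earlier in the thread, and it only looks at shapes in the main body of the document. It lists the fragments; it does not reassemble them into rows.

```vba
' Hypothetical inspection helper (not from the thread): prints the position
' and text of every shape in the document that contains text.
' ASSUMPTION: wDoc is an open Word.Document, e.g. the one opened in
' Sample_ReadFromPDF.
Sub Sketch_ListShapeText(ByVal wDoc As Word.Document)
    Dim shp As Word.Shape
    Dim hasText As Boolean

    For Each shp In wDoc.Shapes
        ' Some shape types have no usable text frame; treat errors as "no text".
        hasText = False
        On Error Resume Next
        hasText = CBool(shp.TextFrame.HasText)
        On Error GoTo 0

        If hasText Then
            ' Top and Left are measured in points, relative to the shape's
            ' positioning settings, so they are only a rough guide to layout.
            Debug.Print Round(shp.Top, 1), Round(shp.Left, 1), shp.TextFrame.TextRange.Text
        End If
    Next shp
End Sub
```

Calling Sketch_ListShapeText wDoc right after the Documents.Open line would print one line per text-bearing shape to the Immediate window, which makes it easy to see how a single cell's text is scattered over several rectangles.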

Post by jstevens »

Hans,

My first impression was the same as yours.

Thanks for taking a look.
Regards,
John

Post by jstevens »

Hans,

I was able to create a workaround using Adobe Acrobat Pro. I opened the PDF file in Adobe Acrobat and saved it as a Word document, which created the proper tables.
Regards,
John

Post by HansV »

Great to hear that!
Best wishes,
Hans

LineLaline
2StarLounger
Posts: 194
Joined: 19 Sep 2022, 16:51

Post by LineLaline »

Oh My God!
:laugh: :rofl: :laugh:
Okay, flashback to a client who had all 'artwork' done by 'professionals' in the USA.
One graphic contained a pie chart. But a section was taken out for a 'nice effect'. The entire illustration though was done in, gulp, PowerPoint! That 'missing slice' was created with a triangle, which was not enough, so they superimposed another rotated one, plus two more to get the angle right. My goddd... I told them if they decided to attend a proper Illustrator course I would throw in a tutorial to create that graphic themselves :laugh:
They accepted.
Ceci n'est pas une signature.

Post by jstevens »

I found this online relating to PDF file conversion to different formats such as Excel, Word, etc.
Regards,
John

Post by HansV »

Thanks!
Best wishes,
Hans