SEARCH

The How-To Geek Forums Have Migrated to Discourse

How-To Geek Forums / Microsoft Office

Grouping and Consolidation of data in Excel

(8 posts)
  • Started 1 year ago by Eligeol
  • Latest reply from Eligeol
  • Topic Viewed 782 times

Eligeol
Posts: 31

Hi again all – I hope I can draw on the collective expertise among you.

I have co-ordinate data exported from a (dxf) file in 2 columns and the way it is exported is thus:
There is a header line in the first column (see attached pic) which tells the software how many lines of co-ords to expect (and draw) – the premise here is that to draw a line between 2 (or more) points you first need the location of those points. The header number then confirms how many points there are and to join those points. The file I have has unfortunately got a whole bunch of lines only and I need to convert (basically join) these to form polygons (blocks/shapes). I have started this process manually in Excel but there are a HUGE number of these.

Consolidate data

Again referring to the attached pic – what I am looking for is a way of condensing the data so that I end up with something like the 2 columns on the right (column H & I).

Phase 1 of this process:
The way this works (from a “logical view”) and as the software I want to work with this data in would “see” it:
Starting with the header line (yellow highlight), the very next row is the start co-ord for the line (eventual polygon) – blue highlight, the next row after that “shows” the software the direction for the next line and so forth etc.
Ultimately the polygon must be closed – and here is the sticky point – to close it, the end /final co-ord must be exactly the same as the first (see Rows 3 & 16; 18 & 34) – pink highlight. The logic sequence I have used thus far is that I look for co-ords immediately below the header (yellow) line and find its matching co-ord that occurs immediately above a header line.

Phase 2:
Once I have this – you can see from the middle set of columns (D & E) that there are a few duplicate co-ords which are unnecessary – deleting them produces the final 2 columns (H & I) which I have spot-checked and produce the same result but much neater and compact in terms of data management.

P.S. The colour coding in Columns F & J are purely for visual help and don’t serve any other value (so they can be deleted).

I hope someone out there can help with this?
Thanks in advance

Posted 1 year ago
Top
 
moreeg
Posts: 842

Welcome back!

As usual this is an intriguing and complex problem. Can you confirm that;

1. You are looking to get from the series in Columns A and B to appear like the series in H and I.
2. You are not looking for any of the formatting (lines and colours) nor the chart to be produced

Can I make the following assumptions:

1. The source data will always come in pairs i.e. every 3rd row is always = 2
2. For each set i.e where beginning co-ordinates = ending co-ordinates it is sufficient to test only the X co-ordinate

This may take a while....

Posted 1 year ago
Top
 
moreeg
Posts: 842

Here's what I've come up with assuming that my assumptions are correct;

I've placed the results in new spreadsheet called Results. This needs to be created (once only)

here is the Macro


Sub Consolidate_Coords()

Sheets("Result").Activate

'Copy the input data to Sheet called "Results"

lRow = Sheets(1).Range("A" & Rows.Count).End(xlUp).Row
Range("A1:H" & lRow).ClearContents
Sheets(1).Range("A1:B" & lRow).Copy
Range("A1").PasteSpecial

'Gets rid of Rows with the Coordinate count which is always "2"
For i = 2 To lRow

If Cells(i, 1) = 2 Then
Cells(i, 1).EntireRow.Delete
End If

Next

'Gets rid of duplicate coordinates and copies the remainder to Columns D and E

lRow = Range("A" & Rows.Count).End(xlUp).Row

rCount = 2
For i = 2 To lRow
If Cells(i - 1, 1) <> Cells(i, 1) Then
Cells(rCount, 4) = Cells(i, 1)
Cells(rCount, 5) = Cells(i, 2)
rCount = rCount + 1
End If
Next

lRow = Range("D" & Rows.Count).End(xlUp).Row

'Finds sequences of coordinate = beginning = ending - and places a blank row between sequences
StartSeq = Cells(2, 4)
For i = 4 To lRow * 2

If Cells(i, 4) = StartSeq Then
Range(Cells(i + 1, 4), Cells(i + 1, 5)).Insert
StartSeq = Cells(i + 2, 4)
i = i + 2
AddRow = AddRow + 2
End If
Next

'Gets the coordinate count for the first sequence of coordinates

Cells(1, 4) = Application.Count(Range("D2:D" & Range("D2").End(xlDown).Row))
lRow = Range("D" & Rows.Count).End(xlUp).Row

'Gets the coordinate count for the remaining sequence sequences of coordinates

Do Until Range("D1").End(xlDown).Row >= lRow
StartSeq = Range("D1").End(xlDown).Offset(2, 0).Address
EndSeq = Range(StartSeq).End(xlDown).Address
Range(StartSeq).Offset(-1, 0) = Application.Count(Range(StartSeq, EndSeq))
Loop

'Adds Row headers
Range("D1:E1").Insert
Range("D1") = "X"
Range("E1") = "Y"

End Sub

Posted 1 year ago
Top
 
raphoenix
Posts: 14920

@moreeg,

Do you use Developer in Excel or just know and remember the Coding Syntax + Functions off the top of your head ?

You are Really Good with Macros and I envy your memory especially not having the actual sheet in front of you. :) :)

Rick P.

Posted 1 year ago
Top
 
moreeg
Posts: 842

Hi Rick

Unfortunately I have a memory like a goldfish - every lap around the bowl is a completely new experience.

I have some basics memorised and rely heavily on the screen tips/auto list members/auto syntax checker to keep me on the straight and narrow. If all that fails I go back to previous code I've written if applicable and finally to "that which knows everything" (google) as the last resort.

Oh, and to burst the bubble even further, I did have the sheet. Eligeol emailed it to me.

In the end though, I believe it's not necessarily how much code you know - I think that I tend to come up with solutions within the limitations of my VBA knowledge and experience which are intermediate at best. The challenge (and the fun) is conceptualizing how the solution could work and then, finally, coding it up.

But thanks for commenting. It'll be interesting to see if this solution is what Eligeol needs.

Posted 1 year ago
Top
 
Eligeol
Posts: 31

Hi moreeg,
Thanks for the prompt reply! Sorry mine is a little late.
I can confirm that I am looking to get from Column A & B to H & I and also that the formatting (lines and colours are not important)

To clarify your questions:
1. At this stage the source data is in pairs BUT I have also seen some of it in longer format (5 rows of co-ords) so this is not a "rule" necessarily. I would hope that in such a situation the 5 rows of co-ords would already represent a polygon and therefore that "set" would already look like the end product (like in Columns H & I)
2. The begin and end co-ords need to use both X and Y. This is more fool proof than just using one (X or Y) because the X or Y can possibly be repeated independently.

I'm not sure if this will change any of the macro? I will go through the sheet and let you know. Thanks again.

Posted 1 year ago
Top
 
moreeg
Posts: 842

Hi Eligeol

I've changed the macro to test for both X and Y to find the beginning and ending and instead of checking for a "2" in the first column I am checking for a Blank in the 2nd column.

If there are more than 2 samples in a sequence in the source data the macro should still work as expected.

See how this new version works for you.


Sub Consolidate_Coords()

Dim lRow As Long
Dim rCount As Long
Dim StartSeqX As Double
Dim StartSeqY As Double
Dim StartSeq
Dim EndSeq

Sheets("Result").Activate

Application.ScreenUpdating = False

'Copy the input data to Sheet called "Results"

lRow = Sheets(1).Range("A" & Rows.Count).End(xlUp).Row
Range("A1:H" & lRow).ClearContents
Sheets(1).Range("A1:B" & lRow).Copy
Range("A1").PasteSpecial

'Gets rid of Rows with the Coordinate count which is always "2"
For i = 2 To lRow

If Cells(i, 1) = 2 Then
Range(Cells(i, 1), Cells(i, 2)).Delete
End If

Next

'Gets rid of duplicate coordinates and copies the remainder to Columns D and E

lRow = Range("A" & Rows.Count).End(xlUp).Row

rCount = 2
For i = 2 To lRow
If Cells(i - 1, 1) <> Cells(i, 1) Then
Cells(rCount, 4) = Cells(i, 1)
Cells(rCount, 5) = Cells(i, 2)
rCount = rCount + 1
End If
Next

lRow = Range("D" & Rows.Count).End(xlUp).Row

'Finds sequences of coordinate = beginning = ending for X and Y- and places a blank row between sequences
'The blank row is used to find the count of each sequence and the count is placed in the blanked row

StartSeqX = Cells(2, 4)
StartSeqY = Cells(2, 5)

For i = 4 To lRow * 2

If Cells(i, 4) = StartSeqX And Cells(i, 5) = StartSeqY Then
Range(Cells(i + 1, 4), Cells(i + 1, 5)).Insert
StartSeqX = Cells(i + 2, 4)
StartSeqY = Cells(i + 2, 5)
i = i + 2
End If
Next

'Gets the coordinate count for the first sequence of coordinates

Cells(1, 4) = Application.Count(Range("D2:D" & Range("D2").End(xlDown).Row))
lRow = Range("D" & Rows.Count).End(xlUp).Row

'Gets the coordinate count for the remaining sequence sequences of coordinates

Do Until Range("D1").End(xlDown).Row >= lRow
StartSeq = Range("D1").End(xlDown).Offset(2, 0).Address
EndSeq = Range(StartSeq).End(xlDown).Address
Range(StartSeq).Offset(-1, 0) = Application.Count(Range(StartSeq, EndSeq))
Loop

'Adds Row headers
Range("D1:E1").Insert
Range("D1") = "X"
Range("E1") = "Y"
Range("A1").Activate
Range("A1").Select

End Sub

Posted 1 year ago
Top
 
Eligeol
Posts: 31

Great! Will check and let you know - thanks again.

Posted 1 year ago
Top
 



Topic Closed

This topic has been closed to new replies.