Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.

Reply
RobGer
Frequent Visitor

Transform CSV Files on a monthly basis

I have the following issue regarding transforming csv files that are sent to me monthly.

The files are named after the YearMonth and the analyzed Subject.

Example: 202404 – Web traffic.

 

The file has the following format:

1st row: Title of the 1st table.

2nd row: Column names for that table.

3rd row till last row of the table: Content of that table.

“Empty Row”

After the empty Row comes another table, following the same format as above (Title, column names and content, followed by another empty row).

 

The title of each table stays the same each month, but I can´t somehow come with a solution that allows me to combine the monthly csv files.

The expected result would be, a query for each “table” that has a column with the source name of the file, so that I can later filter by date.

Each Query would be called after the title of each table, example:

Query Besucherüberblick

SourceName

nb_uniq_visitors

nb_visits

nb_actions

max_actions

nb_actions_per_visit

avg_time_on_site

bounce_rate

202403

„Data for 202403“

202404

898

1128

3227

46

2.9

00:02:35

50 %

 

Sample Data for one month: https://we.tl/t-0buiEhFDdS

 

Note: I use the following encoding to translate the file into power BI:

= Csv.Document(File.Contents("File Path"),[Delimiter=",", Columns=16, Encoding=65001, QuoteStyle=QuoteStyle.None])

 

Is this even possible? Thanks in advance!

1 ACCEPTED SOLUTION
AlienSx
Super User
Super User

This query reads all files in specified folder and grabs all tables. Replace full_folder_path with your own and give it a try.  

let
    file_processing = (b as binary) as table =>
        [s = Lines.FromBinary(b, QuoteStyle.None, false, 65001),
        rm = Table.FromColumns({List.RemoveItems(s, {"", "message", "ok"})}, {"table_name"}),
        group = Table.Group(
            rm, "table_name", 
            {"data", (x) => Table.PromoteHeaders(
                Table.FromList(List.Skip(x[table_name]))
            )}, 
            GroupKind.Local, 
            (s, c) => Number.From(not Text.Contains(c, ","))
        )][group],
    files = Folder.Files("full_folder_path")[[Name], [Content]],
    tra = Table.TransformColumns(files, {"Content", file_processing}),
    xpand = Table.ExpandTableColumn(tra, "Content", {"table_name", "data"})
in
    xpand

qry.jpg

 

 

 

 

 

 

 

 

Then you may create a reference to this query, filter by table name and expand data column to get what you want.

View solution in original post

2 REPLIES 2
AlienSx
Super User
Super User

This query reads all files in specified folder and grabs all tables. Replace full_folder_path with your own and give it a try.  

let
    file_processing = (b as binary) as table =>
        [s = Lines.FromBinary(b, QuoteStyle.None, false, 65001),
        rm = Table.FromColumns({List.RemoveItems(s, {"", "message", "ok"})}, {"table_name"}),
        group = Table.Group(
            rm, "table_name", 
            {"data", (x) => Table.PromoteHeaders(
                Table.FromList(List.Skip(x[table_name]))
            )}, 
            GroupKind.Local, 
            (s, c) => Number.From(not Text.Contains(c, ","))
        )][group],
    files = Folder.Files("full_folder_path")[[Name], [Content]],
    tra = Table.TransformColumns(files, {"Content", file_processing}),
    xpand = Table.ExpandTableColumn(tra, "Content", {"table_name", "data"})
in
    xpand

qry.jpg

 

 

 

 

 

 

 

 

Then you may create a reference to this query, filter by table name and expand data column to get what you want.

Thanks a lot! This has been eating at me for the past weeks, the solution worked perfectly!

Helpful resources

Announcements
LearnSurvey

Fabric certifications survey

Certification feedback opportunity for the community.

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors