Warehouse Activity Profiling
Aims
In this assignment, we will:
- Analyze the warehousing activities of a real-world company to gain insights into its operations.
- Measure relevant key performance indicators (KPIs) to identify opportunities for improvement and discuss optimization strategies.
Learning Objectives
After completing this assignment, students will be able to:
- Recognize warehousing terminology related to material handling and storage systems.
- Analyze and clearly identify opportunities for improvement in warehouse operations.
- Present visual information in a professional and clear manner to facilitate decision-making.
- Address the computational complexities arising from:
  - The large-scale nature of relevant applications.
  - Imperfect data.
Study Case
The following sections provide all relevant warehouse operation requirements and data from an office products wholesaler.
Source. This assignment was adapted from Bartholdi, III, J. J., & Hackman, S. T. For the original assignment and full details, please visit this link.
The Company
S. P. Richards Co. (SPR) distributes wholesale office products. Each distribution center (DC) accepts orders until around 5 PM (depending on the DC and the customer) and ships overnight for next-day delivery. The customers are known and stable, but their orders are not known until shortly before they must be loaded onto a scheduled truck for delivery.
SPR serves three main types of customers: “Mega-dealers,” independent resellers, and internet resellers (the last of which is growing quickly). About 75 percent of orders received by SPR are “wrap-and-label,” which means they are packaged as if they came directly from SPR’s customers. See a presentation of the company here.
Data
The data below concerns the operations of the distribution center in Morristown, NJ, which serves the Philadelphia area, referred to as the “Philadelphia DC.”
Data and data dictionaries provided by companies are often inconsistent, faulty, or incomplete. It’s your responsibility to develop a plan to clean and organize the data.
BCF-zones.xls: Warehouse map. Legend:
- The blue area is Zone B. The lighter blue is 30-inch shelving, and the darker (navy) blue is 18-inch shelving. The top shelf is reserved. (The reserve locations on the top shelf always use Shelf/Level = T and Position = T so that they are distinguishable in the database by the name.) The other locations are active picking locations.
- The purple area is Zone F reserve locations, and the gray is Zone F active picking locations. All of Zone F is pallet racking.
- The yellow area is pallet racking used to hold packing and shipping supplies.
- The red area is 18-inch shelving in Zone C, and the orange is 30-inch shelving in Zone C.
- The green area is Zone R. The lighter green represents 30-inch shelving, and the darker green represents 18-inch shelving.
- Other features of interest in the layout:
- The pickup station, where orders are released, is the first yellow bay from the right.
- The dropoff point where completed orders are checked and packed is at the black conveyors on the left.
- Each order-picker pushes a cart and travels a complete circuit through the main aisles of the zone, picking a batch of orders. Workers pick directly into final packaging, which is cardboard boxes. Any full-case quantities are pulled from reserve. Thus, if a customer requests 13 pens and the pens are packed 12 to a carton, there will be one pick from reserve (of 1 carton) plus one pick from the active pick area (of 1 pen).
- Most of the activity in Philadelphia is in zones B and C. Zone B is the blue shelving in the figure above, and this is the primary picking area. The cluster on the left is 30 inches deep and 42 inches wide. The cluster on the right is 18 inches deep and 42 inches wide. Zone B was intended to hold the fastest-moving of the small products. Zone C is the red shelving, some 18 inches deep, others 30 inches deep, and all 42 inches wide. Zone C was intended to hold slower-moving small products.
- Currently, zones B and C are picked independently of each other, and the two corresponding parts of any order are shipped separately. However, as this has not produced the hoped-for efficiencies, the order-picking for these two zones will be combined, so that each picker will travel through both zones if necessary.
- Zone F is pallet rack that has been configured to hold both full pallets and, in some locations, cartons. Zone F is the gray racking on the bottom left of the figure and also the purple racking that encircles zone B in a sort of horseshoe shape. The gray area of zone F is for active pick locations of products that are stored and picked as eaches, such as brooms or garbage cans. The purple region is intended to hold overstock for zones F and B but also holds some other items that are both stored and picked as eaches.
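The 13-pen example in the list above amounts to an integer division of the ordered quantity by the case quantity. A minimal sketch, assuming every full case is pulled from reserve and only the remainder is picked from the active location; `split_pick` is a hypothetical helper name, not something defined in the dataset or the warehouse management system:

```python
def split_pick(ordered_units: int, case_qty: int) -> tuple[int, int]:
    """Split an order line into (full cases from reserve, loose units from active)."""
    if case_qty <= 0:
        raise ValueError("case_qty must be a positive integer")
    if ordered_units < 0:
        raise ValueError("ordered_units cannot be negative")
    # divmod returns the quotient (full cases) and the remainder (loose units)
    return divmod(ordered_units, case_qty)


# 13 pens ordered, 12 pens per carton
cases_from_reserve, units_from_active = split_pick(13, 12)
print(cases_from_reserve, units_from_active)  # 1 1
```

Applying this split to each order line is one possible way to estimate how picks divide between reserve and active locations when only ordered quantities and case quantities are available.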
DC23CASES AS OF 050210.xls: Contains all the cases in reserve locations for zones A, B, C, R, F, W. Fields:
- Case: The unique case ID for each box.
- SKU: The unique SKU in the box. Mixed cases in reserve are not allowed, so there should only be one record per case ID. (Please report if you find otherwise.)
- Quantity: This is the quantity in the box (in selling units). If the quantity is equal to the standard case quantity (which will be provided in another file), then the reserve location has a case ID on every box. This is what SPR refers to as a “case reserve” location. If the quantity is a multiple of the standard case quantity, then the reserve location has just one case ID for an entire pallet. This is what SPR refers to as a “pallet reserve” location. If it is neither, then someone has put a partial case in reserve, which they are not supposed to do, or the standard case quantity is not correct.
- Zone, Aisle, Bay, Level (shelf), Position: The definition of the reserve location.
- Location Type: This also indicates whether a location is a Case Reserve or Pallet Reserve location. BLU and KCN are codes that refer to a Case Reserve location, with a case ID on every box. PLC and PLP are Pallet Reserve locations, where pallet quantities are stored with a case ID for the entire pallet quantity. Note that for PLC locations the users can pull boxes off a pallet, just reducing the amount assigned to the case number. For PLP locations, only full pallets are pulled. There are no broken pallet quantities.
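The Quantity rule described above can be turned into a small data-quality check. The sketch below is one possible reading of that rule, not SPR's own logic; `classify_reserve_record` and its return labels are hypothetical names, and the standard case quantity is assumed to have been looked up from the SKU file beforehand:

```python
from typing import Optional


def classify_reserve_record(quantity: int, std_case_qty: Optional[int]) -> str:
    """Label a reserve record by comparing its quantity to the standard case quantity."""
    if std_case_qty is None or std_case_qty <= 0:
        return "unknown standard case quantity"
    if quantity <= 0:
        return "invalid quantity"
    if quantity == std_case_qty:
        return "case reserve"  # one case ID per box
    if quantity % std_case_qty == 0:
        return "pallet reserve"  # one case ID for the whole pallet
    # Neither equal nor a multiple: partial case or wrong master data
    return "partial case or wrong standard case quantity"


print(classify_reserve_record(12, 12))   # case reserve
print(classify_reserve_record(480, 12))  # pallet reserve
print(classify_reserve_record(7, 12))    # partial case or wrong standard case quantity
```

Comparing these labels with the Location Type codes (BLU, KCN, PLC, PLP) is one way to surface records that deserve a closer look.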
DC23ACTIVE AS OF 050210.xls. Contains all of the active locations in all zones.
- Fields should be self-explanatory.
- The quantity is the current quantity, in selling units, that exists in the active location as of 01 February 2005.
Phily_Dim_Status2.xls: Contains status and dimensions of the SKUs. Fields:
- Stocked: Possible values are Y (yes), N (no and may be ignored), or X (new item that will be stocked but has no sales history).
- OMS status: Possible values are A (active), D or X (discontinued), or P (purged). All SKUs marked D, X, or P may be ignored.
- SellingUOM: Selling unit of measure.
- SelLen, SelWid, SelHgt, SelWgt: Length, width, height, and weight of the selling unit, in inches and pounds.
- ShipSelling: Possible values are Y (Selling UOM is a corrugated box that can be labeled and shipped without repacking).
- Inner1Qty: The number of SellingUOM’s in an inner pack.
- Inner1Len, Inner1Wid, Inner1Hgt, Inner1Wgt: Length, width, height, and weight of the inner pack, in inches and pounds.
- Case Qty: The number of SellingUOM’s in a case.
- CaseLen, CaseWid, CaseHgt, CaseWgt: Length, width, height, and weight of a case, in inches and pounds.
ITEMDATAV2.txt: Additional SKU data in the form of a text file. Fields:
- Vendor ID (From whom the item is purchased)
- Hazmat Code (Whether item is hazardous and so requires special handling. May be ignored)
- Selling UOM
- Stocking Status
- Item Status
- Selling UOM Length, Width and Height (in inches)
- Case UOM Length, Width and Height (in inches)
- Selling UOM/Case
- Quantity Ordered
- Lines Ordered
- Max Order Size
- Average Order Size
- Max Case Order
- Average Case Order
- Case Lines Ordered
- Max Order No Case
- Average Order No Case
- The last columns display amounts on-hand. There are two columns for each month. The AVG column is the average of the inventory levels over all days of the month. For example, “janavg” gives the average inventory levels over the days of January 2004. The OH is the inventory level on the last day of the month. For example, “janoh” gives the inventory level on 31 January 2004.
DC23MINMAX2.xls: A spreadsheet containing “min” and “max” values for the SKU in each active pick location:
- The warehouse management system triggers a replenishment to any location for which the number of pieces (eaches) has fallen to the min value; and the replenishment is sufficient to bring the inventory level at that location up to the max level.
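The min/max rule above can be written as a one-line decision. A minimal sketch, assuming the trigger fires when the on-hand quantity is at or below the min value and that the replenishment always tops the location up to the max value; `replenishment_qty` is a hypothetical helper name:

```python
def replenishment_qty(on_hand: int, min_qty: int, max_qty: int) -> int:
    """Return the number of eaches to move to an active location (0 if no trigger)."""
    if min_qty > max_qty:
        raise ValueError("min_qty cannot exceed max_qty")
    if on_hand <= min_qty:
        # Triggered: bring the location back up to its max level
        return max_qty - on_hand
    return 0


print(replenishment_qty(on_hand=5, min_qty=5, max_qty=20))  # 15
print(replenishment_qty(on_hand=6, min_qty=5, max_qty=20))  # 0
```

Replaying daily demand against this rule is one way to approximate how often each active location would be replenished, since the provided files contain no explicit replenishment log.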
dc23Sales04.txt: A sales history (text file; 59MB) containing the following fields:
- item: Unique SKU ID.
- vnid: Vendor ID.
- itdesc: Text description of SKU.
- ordnbr: Unique ID of this order.
- linenbr: Number of the line in this order.
- cusnbr: Unique ID of customer.
- ordqty: How many selling units ordered.
- shipqty: How many selling units shipped (may be less than ordered if, for example, out of stock. Should never be more.)
- uom: Selling unit-of-measure.
- txndate: Order date and time.
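The `ordnbr` and `linenbr` fields are enough to derive a lines-per-order distribution. The sketch below uses a handful of made-up rows (the order numbers are illustrative, not taken from the file) and assumes that a repeated (order, line) pair is a duplicate record to be counted once; `lines_per_order_profile` is a hypothetical helper name:

```python
from collections import Counter

# Hypothetical sample rows; the real file has more fields and far more rows
sales_rows = [
    {"ordnbr": "1001", "linenbr": 1},
    {"ordnbr": "1001", "linenbr": 2},
    {"ordnbr": "1001", "linenbr": 3},
    {"ordnbr": "1002", "linenbr": 1},
    {"ordnbr": "1003", "linenbr": 1},
    {"ordnbr": "1003", "linenbr": 2},
    {"ordnbr": "1003", "linenbr": 2},  # duplicated record, counted once
]


def lines_per_order_profile(rows):
    """Map 'number of lines' -> 'number of orders with that many lines'."""
    unique_lines = {(row["ordnbr"], row["linenbr"]) for row in rows}
    lines_per_order = Counter(order for order, _ in unique_lines)
    profile = Counter(lines_per_order.values())
    return dict(sorted(profile.items()))


profile = lines_per_order_profile(sales_rows)
print(profile)  # {1: 1, 2: 1, 3: 1}
```

The same grouping logic carries over to a DataFrame once the file is loaded; whether duplicates should be dropped is a cleaning decision to state explicitly in the summary.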
TRCART23.txt: A snapshot of the batches that were formed by the warehouse management system for assignment to a particular truck (picking cart).
- Fields:
  - Truck Nbr (Unique Batch Number)
  - Area (This field should be blank. Ignore it.)
  - Zone
  - Aisle
  - Bay
  - Level
  - Position
  - Style
  - Style Sfx
  - Color
  - Carton Line Nbr
  - Ctn/Pkt qty (Quantity to be picked for this carton and line)
  - Carton# (Carton ID this line will be picked for)
  - Pickticket Control Number
  - Pkt Line Nbr
  - Print Wave
  - Date Created
  - Time Created
- Only active cartons are batched by the warehouse management system, so there will be no corresponding records in TRCART23 for the reserve cartons.
- Occasionally, multiple lines in a carton may have the same SKU, reflecting customer orders that SPR preserves.
- All cartons are listed in the following two carton tables, including reserve cartons:
  - CDCART23.txt. Fields:
    - Pickticket Control Number (aka order number)
    - Carton# (Unique number for each shipping carton)
    - Carton Line Nbr
    - Style (Field 1 of SKU)
    - Style Sfx (Field 2 of SKU)
    - Color (Field 3 of SKU)
    - To Be Packed Units (Units called for in this box)
    - Units Packed (Units packed in this box, may be less if there was a shortage)
    - Zone (Location from which the item is to be pulled)
    - Aisle
    - Bay
    - Level
    - Position
    - Date Created (Date and time this record was created in the database; indicates when the order was waved and picking documents were printed)
    - Time Created
  - CHCART23.txt. Fields:
    - Carton#
    - Pickticket Control Number
    - Print Wave (The wave number for the wave this carton was printed with)
    - Pack Code (This field is ‘01’ if the carton was from reserve, and blank if it was from active)
    - Truck Plan ID
    - Date Created
    - Time Created
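The Pack Code field in CHCART23.txt separates reserve cartons (‘01’) from active cartons (blank), which gives a direct count of each. A minimal sketch on made-up records; the dictionary keys `carton` and `pack_code` are hypothetical column names chosen for illustration, and any other code is kept in a separate bucket rather than guessed at:

```python
from collections import Counter


def carton_source_counts(carton_headers):
    """Count cartons by source, based on the Pack Code value."""
    counts = Counter()
    for row in carton_headers:
        pack_code = (row.get("pack_code") or "").strip()
        if pack_code == "01":
            counts["reserve"] += 1
        elif pack_code == "":
            counts["active"] += 1
        else:
            counts["other"] += 1  # unexpected code: inspect before using
    return counts


# Hypothetical sample records
sample_headers = [
    {"carton": "C1", "pack_code": "01"},
    {"carton": "C2", "pack_code": ""},
    {"carton": "C3", "pack_code": "  "},  # whitespace only is treated as blank
    {"carton": "C4", "pack_code": "01"},
]

counts = carton_source_counts(sample_headers)
reserve_share = counts["reserve"] / sum(counts.values())
print(counts["reserve"], counts["active"], f"{reserve_share:.0%}")  # 2 2 50%
```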
Problem Description
The company intends to gain insights into its current operations and identify performance bottlenecks. Analyses of the company’s warehousing activities include (but are not limited to):
- SKU and Vendor Analysis:
- Which SKUs have the highest turnover rates in each zone?
- Which vendors supply the most frequently picked SKUs?
- How is warehouse space allocated to different SKUs and vendors in terms of square footage?
- Order and Line Distribution:
- What is the average number of lines per order in zones B and C compared to other zones?
- How does the line-per-order ratio vary during peak and off-peak seasons?
- Picking Activity:
- What proportion of total picks is performed in zones B and C relative to the entire warehouse?
- How many picks are fulfilled from reserve locations versus active picking locations?
- What are the average picking times in reserve locations versus active locations?
- SKU Location Strategy:
- Which SKUs are stored in reserve locations and which are in active picking areas?
- In which zones are reserve locations predominantly found?
- How frequently are items moved from reserve to active locations?
- Efficiency and Productivity:
- Which zones demonstrate the highest efficiency in terms of picks per hour?
- What is the impact of SKU placement on picking efficiency in each zone?
- Inventory Management:
- What is the frequency of stockouts for high-demand SKUs in different zones?
- How does the replenishment rate compare between reserve and active locations?
- What trends can be observed in inventory levels over time for key SKUs?
- Space Utilization:
- What percentage of warehouse space is underutilized or overutilized?
- How does the physical space devoted to each vendor correlate with their sales contribution?
To facilitate communication with the warehousing team and management and help to highlight opportunities for improvement, you are required to present these analyses in the form of activity profiles.
TASK 1. Activity Profiles
Create 7 activity profiles from the four major categories:
- Heatmap (1 profile)
- Customer Order Profiles (2 profiles)
- Item Activity Profiles (2 profiles)
- Inventory Profiles (2 profiles)
For each proposed profile:
- Present a summary of main assumptions and the key findings.
- Propose an adequate strategy to address the opportunity for improvement.
- Present the code leading to the profile plot.
For example, suppose a Customer Pareto shows that just three customers make up 50 percent of the client’s distribution center activity. Your summary could be as follows:
A small minority of customers generate a large majority of picking and shipping activity. Of the 1,000 customers, three customers, namely, Customer 1, Customer 2, and Customer 3, make up 50 percent of the client’s distribution center picking activity (measured as the percentage of the total number of picks in year 2005). Assigning customers to their own zones in the warehouse may reduce travel time and improve customer service.
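The figure quoted in a Customer Pareto summary (how many customers account for a given share of picks) comes from a cumulative sum over customers ranked by activity. A minimal sketch with made-up pick counts; `customers_covering_share` is a hypothetical helper name, and ties are broken alphabetically as an arbitrary assumption:

```python
def customers_covering_share(picks_by_customer, target_share=0.5):
    """Return the top customers whose cumulative picks reach target_share of the total."""
    total = sum(picks_by_customer.values())
    if total <= 0:
        return []
    # Rank by picks (descending); break ties by customer ID for reproducibility
    ranked = sorted(picks_by_customer.items(), key=lambda kv: (-kv[1], kv[0]))
    selected, cumulative = [], 0
    for customer, picks in ranked:
        selected.append(customer)
        cumulative += picks
        if cumulative / total >= target_share:
            break
    return selected


# Hypothetical pick counts per customer
sample_picks = {"C1": 300, "C2": 150, "C3": 50, "C4": 250, "C5": 250}
print(customers_covering_share(sample_picks, 0.5))  # ['C1', 'C4']
```

Reporting both the number of customers returned and the total number of customers gives the caption its exact counts.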
Activity Profile Categories
Main activity profile categories by Frazelle (2016):
Heatmap
Customer Order Profiles
- 2.1. Customer paretos
- 2.2. Composition profiles
- 2.2.1. Family-mix profile
- 2.2.2. Handling-mix profile
- 2.2.3. Order-increment profile
- 2.3. Customer order volumetric profiles
- 2.3.1. Lines-per-order profile
- 2.3.2. Cube-per-order profile
- 2.3.3. Lines-and-cube-per-order profile
- 2.4. Calendar-clock profiles
- 2.4.1. Month-of-year profile
- 2.4.2. Week-of-month activity distribution
- 2.4.3. Day-of-week activity distribution
- 2.4.4. Hour-of-day activity distribution
Item Activity Profile
- 3.1. Item popularity profile
- 3.2. Cube-movement profile
- 3.3. Popularity-cube-movement profile
- 3.4. Item-order-completion profile
- 3.5. Demand-correlation profile
- 3.6. Demand-variability profile
Inventory Profile
- 4.1. Item-family inventory profile
- 4.2. Handling unit inventory profile
TASK 2. KPI Assessment
Assess 5 KPIs from the three major dimensions:
- Time (1 KPI)
- Quality (2 KPIs)
- Productivity (2 KPIs)
Refer to Staudt et al. (2015) for details on how the KPIs are calculated. For each KPI:
- Present a summary of main assumptions to calculate the KPI.
- Discuss key findings and opportunities for improvement.
- Present the code to perform the KPI calculation.
Use your creativity to “fill in the gaps”: if a KPI cannot be directly defined from the provided data, you may make reasonable assumptions based on, for example, best practices, industry standards, or averages. Missing data is a common challenge. Overcoming this is part of the assignment and may require innovative thinking and a willingness to explore alternative data sources or proxy measures. As long as you back up your claims and assumptions with data and make a plausible argument, that’s perfectly fine.
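As an illustration of defining a KPI from the available fields, the sketch below derives two fill-rate variants from ordered and shipped quantities. These are one possible operationalization, to be checked against the definition in Staudt et al. (2015): an order counts as filled only if every line shipped in full, and shipped quantities are capped at the ordered quantity. The function name `fill_rates` and the sample rows are hypothetical:

```python
from collections import defaultdict


def fill_rates(rows):
    """Return (order fill rate, unit fill rate) from (order, ordqty, shipqty) rows."""
    ordered_total = 0
    shipped_total = 0
    order_complete = defaultdict(lambda: True)
    for order, ordqty, shipqty in rows:
        if ordqty <= 0:
            continue  # assumption: non-positive order quantities are bad data
        shipped = max(0, min(shipqty, ordqty))  # shipped should never exceed ordered
        ordered_total += ordqty
        shipped_total += shipped
        order_complete[order] = order_complete[order] and shipped == ordqty
    if not order_complete:
        raise ValueError("no valid order lines")
    order_fill = sum(order_complete.values()) / len(order_complete)
    unit_fill = shipped_total / ordered_total
    return order_fill, unit_fill


# Hypothetical rows: (order ID, units ordered, units shipped)
sample_lines = [("O1", 10, 10), ("O1", 5, 5), ("O2", 4, 2), ("O3", 6, 6)]
order_fill, unit_fill = fill_rates(sample_lines)
print(f"{order_fill:.1%} of orders, {unit_fill:.1%} of units")  # 66.7% of orders, 92.0% of units
```

The two numbers differ because they use different levels of aggregation, which is why the chosen level should be stated next to the KPI.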
Warehouse KPIs per Dimension
List of KPIs by dimension according to Staudt et al. (2015):
Time
- Order lead time
- Receiving time
- Order picking time
- Delivery lead time
- Queuing time
- Putaway time
- Shipping time
- Dock-to-stock time
- Equipment downtime
Quality
- On-time delivery
- Customer satisfaction
- Order fill rate
- Physical inventory accuracy
- Stock-out rate
- Storage accuracy
- Picking accuracy
- Shipping accuracy
- Delivery accuracy
- Perfect orders
- Scrap rate
- Orders shipped on time
- Cargo damage rate
Productivity
- Labour productivity
- Throughput
- Shipping productivity
- Transport utilization
- Warehouse utilization
- Picking productivity
- Inventory space utilization
- Outbound space utilization
- Receiving productivity
- Turnover
Requirements
Template Document: This assignment shall be implemented in a Python Notebook using this template.
Answers comprise the following elements:
- Descriptive title of the analysis (change the level-three `###` headings in the template under I) Profiles and II) KPIs).
- Summary of the key findings and main assumptions. All summaries must be written in the style of a caption, following best practices (see how to write standalone captions).
- Code to perform the analysis, which has to be explained and commented on appropriately.
- Figures and/or tables featuring the results of the analysis (directly generated by the code cell). All figures and tables must follow best practices (watch the lecture on Tables and Figures (Coursera) starting at minute 4:42).
Tables and figures in a scientific manuscript are essential as they lay out the core narrative. It’s crucial for these elements to be self-explanatory and independent, enabling readers to understand the data without referring back to the main text. They should clearly define any acronyms and experimental details, convey a coherent story, and each should have a distinct, clear point. Often, these elements are the first point of contact with your work: readers typically skim through them to get a sense of what your research is about. The quality of these elements is frequently used as a heuristic to judge the quality of the research.
Assessment Criteria
Each activity profile and KPI will be evaluated using a 3-point rubric scale:
- 3 (Very Good) = 100%
- 2 (Satisfactory) = 50%
- 1 (Needs improvement) = 25%
- 0 (No submission) = 0%
Refer to Table 9.2 for activity profiles and Table 9.3 for KPIs to see what each score represents.
Use the rubrics as a guide when developing your own work, and make sure every criterion is addressed thoroughly.
Peer Review
Peer review is part of this assignment. You will evaluate two classmates’ submissions using the scoring matrix in Table 9.1. For each activity profile and KPI, assign a score from 1 to 3 and provide justification based on the evidence presented in the analysis.
Label | Category | Weight | Score | Feedback |
---|---|---|---|---|
AP1 | Heatmap | 15% | | |
AP2 | Customer Order | 10% | | |
AP3 | Customer Order | 10% | | |
AP4 | Item Activity | 10% | | |
AP5 | Item Activity | 10% | | |
AP6 | Inventory | 10% | | |
AP7 | Inventory | 10% | | |
KPI1 | Time | 5% | | |
KPI2 | Quality | 5% | | |
KPI3 | Quality | 5% | | |
KPI4 | Productivity | 5% | | |
KPI5 | Productivity | 5% | | |
Evaluation Rules
- Each student is assigned two reviews.
- Completing 2 reviews = full points.
- Completing 1 review = half points.
- Completing 0 reviews = no points.
- If a submission has fewer than three reviews, the instructor will assign additional reviewers or provide an instructor review.
- Repair assignments are capped at a maximum grade of 5.5 and are graded directly by the instructor.
- The instructor may randomly audit reviews for fairness.
- Students may appeal reviews within 72 hours of receiving them. Appeals must reference specific rubric rows.
Rubrics
Criterion | 3 – Very Good | 2 – Satisfactory | 1 – Needs improvement |
---|---|---|---|
Context & scope | States period, dataset size, and units; defines terms used in the caption. | Mentions some context, but key elements missing. | Little or no context; unclear scope or units. |
Assumptions & method | Assumptions are plausible, concise, and in-code; data prep and grouping rules are summarized precisely. | Assumptions partly justified or described outside code; some ambiguity. | Assumptions speculative or missing; code and text diverge. |
Quantitative precision | Uses percentages and exact counts; avoids vague words; shows summary stats when relevant. | Some quantification, but several vague claims. | Mostly qualitative or vague claims. |
Visualization quality | Axes labeled with units; number type, scale, and precision appropriate: IDs shown as integers, counts with thousand separators, rates as percentages with sensible decimals, units explicit; rounding does not change interpretation; scientific notation only when needed. Chart type fits task; heatmaps reflect physical layout; outliers handled and explained. | Minor labeling or scaling issues; some inconsistency in decimals or separators; choice of chart mostly ok; outliers noted but not treated. | Labels missing or misleading; inappropriate numeric types or precision; pie or donut used where bar would be clearer; heatmaps distort layout; outliers skew results. |
Insight & improvement strategy | Findings are specific, supported by numbers, and lead to feasible warehouse actions tied to context. | Findings generally sensible, with limited quantification or generic actions. | Speculative or generic advice not grounded in data. |
Writing & brevity | Caption is concise and standalone; no filler; proofreading evident. | Mostly clear but wordy; minor language issues. | Redundant, verbose; noticeable grammar issues. |
Criterion | 3 – Very Good | 2 – Satisfactory | 1 – Needs improvement |
---|---|---|---|
Definition alignment | KPI is correctly defined and matched to available fields and level of aggregation (e.g., order, order-line, shipment, SKU-location). | Mostly correct, with small mismatches or imprecise wording. | Misdefined KPI or mismatched data level of aggregation (e.g., mixing orders and order-lines). |
Calculation accuracy | Method is correct, reproducible, and briefly described in-code; edge cases handled. | Minor formula or level of aggregation issues; reproducibility mostly ok. | Significant errors in calculation or logic. |
Context & benchmarks | Time window, N of records, and units reported; benchmark or threshold justified by source or stakeholder rationale. | Some context; benchmark asserted but weakly justified. | No context; arbitrary thresholds. |
Visualization and reporting | Simple, fit-for-purpose visuals or a single number; number type, scale, and precision appropriate: correct units and denominators, sensible decimals, thousand separators, rounding that does not distort meaning. | Minor clarity or formatting inconsistencies. | Misleading rounding, missing or incorrect units, mixed scales, or inappropriate numeric types. |
Insight & opportunity | Interprets KPI drivers quantitatively and proposes feasible actions; avoids speculation that data could resolve. | Some interpretation; actions somewhat generic. | Speculation or unjustified causal claims. |
Writing & brevity | Direct, proofread, caption stands alone. | Minor verbosity or typos. | Redundant, verbose, or confusing. |
Submission
Upon finishing, upload the following to the Canvas assignment page:
- The group iPython Notebook.
- A frozen version of the Notebook to ensure the submission remains unchanged. You can print the Notebook to PDF (Ctrl + P) and submit that version.
Useful Resources
- 5.1 Tables and Figures. Writing in the Sciences (Coursera).
- How to structure a data science project for correctness and reproducibility?
- How to write a figure caption.
- Overview of Google Colaboratory Features.
- Markdown Guide.
- Working with Jupyter Notebooks.
- 10 minutes to Pandas.
- Tip: Investigate Pandas DataFrame reading functions (e.g., `read_fwf`, `read_excel`, `read_csv`) to work productively with data.
- Tip: Use generative AI to create the base code for the charts and read data (e.g., from .csv, .xlsx, .txt). For example, you can prompt ChatGPT to write a script to create base plots and adjust according to your needs. Explore libraries such as Plotly, Matplotlib, and Seaborn.
FAQ
Q. The data is messy!
A. Yes. Real-world data is likely messy, faulty, redundant, ambiguous… Can you still extract insights from it?
Q. The heatmap is too hard to accomplish, there is not enough time.
A. Put things in perspective; this is a partial assignment with many exercises. Therefore:
- Identify the low-hanging fruits. Clean and analyze only the information you need to generate the profiles and calculate the KPIs you decide to show.
- Noting that the heatmap is the hardest exercise, leave it as the last one. All the others can be accomplished faster.
Q. Am I supposed to answer the questions under the problem description?
A. No. The questions are examples of what your profile assessment could answer. Investigating other questions is also possible.
Q. I could not find any costs in the data; how do I measure costs?
A. Can you derive costs based on assumptions? For example, based on the hourly rate of pickers, etc.
Q. I don’t know Python; where should I start?
A. This assignment can be accomplished using a subset of Python instructions pertaining to the Pandas library. Therefore, the easiest way to start is by understanding how to work with Python Notebooks (see the Google Colab Tutorial) and the “10 minutes to Pandas” tutorial.
Q. What is a wave?
A. In the warehousing context, “wave” refers to “wave picking,” an order picking method in which pickers work on “waves” (see Wave picking - Wikipedia).
Q. What is an order line?
A. An order line refers to a specific item in an order. Each product within an order has a distinct line associated with a unique order line number. For instance, order line 1 might represent 10 units of Product 1, while order line 2 represents 5 units of Product 2.
Q. How do I create the heatmap? The warehouse layout data is faulty!
A. Indeed, the layout is the least standardized information. It is part of the assignment to make reasonable assumptions based on the data you have: layout, photos (see the company presentation .ppt). Therefore, ask yourself first what you want to convey before spending too much time trying to create a perfect heatmap. For example, is it useful to show activity in a portion of the warehouse (e.g., a subset of zones)?
Still, be aware that some manual work is unavoidable; the layout, or part of it, should be transferred to a data structure you can work with programmatically.
TIP: Use your creativity to ask AI for code to generate and color the locations. These could be represented, for example, as points in a scatter plot or squares drawn on the plotting area.
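Before any plotting, pick counts per location can be arranged in a grid, which is the kind of data structure the answer above refers to. The sketch below assumes locations can be reduced to an (aisle, bay) pair and that sorting aisles and bays approximates their physical order; both are simplifications to verify against the warehouse map. `build_pick_grid` and the sample counts are hypothetical:

```python
def build_pick_grid(picks_by_location):
    """Arrange picks keyed by (aisle, bay) into a row-per-aisle, column-per-bay grid."""
    aisles = sorted({aisle for aisle, _ in picks_by_location})
    bays = sorted({bay for _, bay in picks_by_location})
    # Locations with no recorded picks are shown as 0
    grid = [[picks_by_location.get((a, b), 0) for b in bays] for a in aisles]
    return aisles, bays, grid


# Hypothetical pick counts for a few (aisle, bay) locations
sample_picks_by_location = {(1, 1): 40, (1, 2): 5, (2, 1): 12, (2, 3): 30}
aisles, bays, grid = build_pick_grid(sample_picks_by_location)
for aisle, row in zip(aisles, grid):
    print(aisle, row)
# 1 [40, 5, 0]
# 2 [12, 0, 30]
```

A nested list like this can be handed to a plotting function such as Matplotlib's `imshow`, with the aisle and bay lists used as tick labels.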
Q. How do we simulate picking activity more precisely?
A. In the original material (see 4. Labor economics in Warehouse & Distribution Science (warehouse-science.com)), you can find some data on average picking times for each zone and a description of what picking looks like (i.e., what each order picker is doing).
Q. The lead time seems high (several months). Is this OK?
A. It looks odd, indeed, but you can report that. It is more important that your handling of the data and the process you use to generate the KPI are correct. If, eventually, the input changes and the date differences between the Sales file and the CDC/TRC/CHC files are more reasonable, the KPI will be correctly calculated.
Q. Can I use external data for the heatmap? For example, a preprocessed Excel file?
A. You can use an Excel file where you manually added the coding by downloading it from your personal drive (as we do in the first cell). An alternative that does not require downloading is to save this file as CSV and copy the content to Python and save it as a variable. For example:
map_info_csv = """B-1-A; B-2-B ...
B-1-B; B-2-B...
B-1-C; B-2-B...
"""
The triple quotes let you store multi-line strings in the variable `map_info_csv`.
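Once the layout is stored as a string, each label can be mapped to its row and column in the grid. A minimal sketch using a small, fully made-up layout (the labels are illustrative and do not reproduce the actual map); `parse_layout` is a hypothetical helper name, and the semicolon delimiter follows the example string:

```python
import csv
import io

# Hypothetical two-row layout; each cell is one location label
map_info_csv = """B-1-A; B-2-A; B-3-A
B-1-B; B-2-B; B-3-B
"""


def parse_layout(text, delimiter=";"):
    """Map each location label to its (row, column) position in the layout."""
    positions = {}
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    for row_idx, row in enumerate(reader):
        for col_idx, cell in enumerate(row):
            label = cell.strip()  # drop the space after each delimiter
            if label:  # empty cells stand for aisles or unused space
                positions[label] = (row_idx, col_idx)
    return positions


positions = parse_layout(map_info_csv)
print(positions["B-2-B"])  # (1, 1)
```

The resulting positions can then be joined with pick counts per location to color the cells.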
Q. Do item families have to follow ABC proportions (80%, 15%, 5%)?
A. No. ABC is one option, not a requirement. You may define families in ways that reveal useful patterns, for example:
- Shared keywords in product names or descriptions (look for comma-separated tags).
- Existing categories such as hazmat, temperature controlled, bulky, fragile.
- Brand, size, form factor, or storage mode.
Whatever you choose, document the rule clearly and show a few examples of the mapping.
Q. Which inventory metric should I plot for the item-family profile?
A. Use the metric that best tells the story, and state it explicitly. If the raw field looks noisy or inconsistent, justify any cleaning or filtering you apply.
Q. What time granularity should I use (month, quarter, year)?
A. Choose what makes the pattern legible and relevant. A good workflow is to explore weekly or monthly, then present the period that offers the clearest insight. You may include one overview chart per year plus a zoomed-in month if that helps interpretation.
Q. Can I estimate shipping productivity and throughput without explicit shipping times?
A. Yes, with stated assumptions. The dataset contains WMS wave creation times. You can model a lag from wave creation to pick start, travel, staging, and load. Choose a reasonable constant or rule, justify it, and test sensitivity.
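A sensitivity test of the lag assumption can be as simple as recomputing the KPI for several lag values. The sketch below assumes a single constant lag added after the last wave creation time and measures throughput as cartons per hour over the resulting window; the function name, the timestamps, and the carton count are all hypothetical:

```python
from datetime import datetime, timedelta


def cartons_per_hour(n_cartons, first_wave, last_wave, lag_hours):
    """Throughput over the window from first wave creation to last wave plus lag."""
    window = (last_wave + timedelta(hours=lag_hours)) - first_wave
    hours = window.total_seconds() / 3600
    if hours <= 0:
        raise ValueError("the shipping window must be positive")
    return n_cartons / hours


# Hypothetical day: waves created between 08:00 and 15:00, 2,100 cartons
first_wave = datetime(2004, 1, 5, 8, 0)
last_wave = datetime(2004, 1, 5, 15, 0)
for lag in (1, 2, 3):
    rate = cartons_per_hour(2100, first_wave, last_wave, lag)
    print(f"lag {lag} h: {rate:,.1f} cartons/hour")
```

If the KPI changes little across plausible lags, the conclusion is robust to the assumption; if it changes a lot, the caption should say so.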
Q. Some `.txt` files have no header row. How should I parse them?
A. This is expected. Treat it as a real-world (challenging!) data cleaning task. If you are stuck, check the notebook: Parsing TRCART23.csv Data
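For a file without a header row, the column names have to be supplied by hand from the data dictionary. A minimal sketch with the standard library, assuming comma-delimited text and a shortened, hypothetical column list; the real files may use another delimiter or fixed-width columns, so the delimiter and names below are placeholders:

```python
import csv
import io

# Hypothetical subset of column names and two made-up rows
COLUMNS = ["truck_nbr", "zone", "aisle", "bay", "level", "position"]
raw_text = "1001, B, 12, 03, 2, 1\n1001, C, 04, 11, 1, 2\n"


def read_headerless(text, columns, delimiter=","):
    """Parse delimited text with no header row into a list of dictionaries."""
    # Passing fieldnames makes DictReader treat the first row as data
    reader = csv.DictReader(io.StringIO(text), fieldnames=columns, delimiter=delimiter)
    return [
        {key: (value or "").strip() for key, value in row.items() if key is not None}
        for row in reader
    ]


rows = read_headerless(raw_text, COLUMNS)
print(rows[0]["zone"], rows[1]["bay"])  # B 11
```

In Pandas, `read_csv` accepts `header=None` together with `names=` for the same purpose. Values are kept as strings here so that codes such as bay `03` do not lose their leading zero.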
Q. In `itemdatav2.txt`, what does “quantity ordered” represent? It seems inconsistent with monthly inventory.
A. The key is to curate fields that make sense and be explicit about assumptions or exclusions. If you discover discrepancies, choose one of the following:
1. Drop the field from analysis.
2. Keep the field but add a clear note to the report explaining the discrepancy.
Q. I am having trouble converting the .ipynb file to .pdf.
A. Try the following:
- Export the notebook as HTML (File > Download > Download .html) and print the HTML file to PDF using your browser’s print function (Ctrl + P).
- Install pandoc, LaTeX, and nbconvert to convert the notebook directly to PDF from the command line:
  - Install Jupyter nbconvert: `pip install nbconvert`
  - Install a LaTeX distribution (e.g., TeX Live, MiKTeX) and make sure you update it so it includes common packages.
  - Install Pandoc: Pandoc installation guide
  - Convert the notebook to PDF using the command line: `jupyter nbconvert --to pdf your_notebook.ipynb`
  - If you want to remove the code cells from the PDF, you can use the `--no-input` flag: `jupyter nbconvert --to pdf your_notebook.ipynb --no-input`
- Instead of installing a full LaTeX distribution, you can use `nbconvert` to export the notebook to PDF via a web PDF renderer. This requires installing Playwright and Chromium:
  - `python -m pip install -U "nbconvert[webpdf]" playwright`
  - `python -m playwright install chromium`
  - `jupyter nbconvert ".\wh-as_activity-profiling-template.ipynb" --to webpdf --allow-chromium-download --output-dir "C:\Temp" --debug`